BUG: Cannot construct StringArray from pyarrow.ChunkedArray with zero chunks · Issue #41040 · pandas-dev/pandas · GitHub | Latest TMZ Celebrity News & Gossip | Watch TMZ Live
Skip to content

BUG: Cannot construct StringArray from pyarrow.ChunkedArray with zero chunks #41040

Closed
@xhochy

Description

@xhochy
  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

import pyarrow as pa
import pandas as pd

df = pd.DataFrame({"string": pd.array([], dtype="string")})
table = pa.Table.from_pandas(df)
print(table.column(0))
# <pyarrow.lib.ChunkedArray object at 0x12c5cec70>
# [
#   []
# ]

# Construct a Table with zero chunks as returned by `pyarrow.dataset`
empty_table = pa.table([pa.chunked_array([], type=pa.string())], schema=table.schema)
print(empty_table.column(0))
# <pyarrow.lib.ChunkedArray object at 0x12c5cee00>
# [
# 
# ]

empty_table.to_pandas()

yields the following traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-5047304e95a2> in <module>
----> 1 empty_table.to_pandas()

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/array.pxi in pyarrow.lib._PandasConvertible.to_pandas()

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/table.pxi in pyarrow.lib.Table._to_pandas()

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, categories, ignore_metadata, types_mapper)
    790     _check_data_column_metadata_consistency(all_columns)
    791     columns = _deserialize_column_index(table, all_columns, column_indexes)
--> 792     blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
    793 
    794     axes = [columns, index]

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/pandas_compat.py in _table_to_blocks(options, block_table, categories, extension_columns)
   1131     result = pa.lib.table_to_blocks(options, block_table, categories,
   1132                                     list(extension_columns.keys()))
-> 1133     return [_reconstruct_block(item, columns, extension_columns)
   1134             for item in result]
   1135 

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/pandas_compat.py in <listcomp>(.0)
   1131     result = pa.lib.table_to_blocks(options, block_table, categories,
   1132                                     list(extension_columns.keys()))
-> 1133     return [_reconstruct_block(item, columns, extension_columns)
   1134             for item in result]
   1135 

~/mambaforge/envs/fletcher/lib/python3.8/site-packages/pyarrow/pandas_compat.py in _reconstruct_block(item, columns, extension_columns)
    749             raise ValueError("This column does not support to be converted "
    750                              "to a pandas ExtensionArray")
--> 751         pd_ext_arr = pandas_dtype.__from_arrow__(arr)
    752         block = _int.make_block(pd_ext_arr, placement=placement,
    753                                 klass=_int.ExtensionBlock)

~/Development/pandas/pandas/core/arrays/string_.py in __from_arrow__(self, array)
    119             results.append(str_arr)
    120 
--> 121         return StringArray._concat_same_type(results)
    122 
    123 

~/Development/pandas/pandas/core/arrays/_mixins.py in _concat_same_type(cls, to_concat, axis)
    240         dtypes = {str(x.dtype) for x in to_concat}
    241         if len(dtypes) != 1:
--> 242             raise ValueError("to_concat must have the same dtype (tz)", dtypes)
    243 
    244         new_values = [x._ndarray for x in to_concat]

ValueError: ('to_concat must have the same dtype (tz)', set())

Problem description

The problem here is that we are checking in _concat_same_type whether we concat arrays of the same type but as we don't concatenate anything, there are also no types to check. This should instead create an empty array.

Metadata

Metadata

Assignees

Labels

BugStringsString extension data type and string data

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions

    TMZ Celebrity News – Breaking Stories, Videos & Gossip

    Looking for the latest TMZ celebrity news? You've come to the right place. From shocking Hollywood scandals to exclusive videos, TMZ delivers it all in real time.

    Whether it’s a red carpet slip-up, a viral paparazzi moment, or a legal drama involving your favorite stars, TMZ news is always first to break the story. Stay in the loop with daily updates, insider tips, and jaw-dropping photos.

    🎥 Watch TMZ Live

    TMZ Live brings you daily celebrity news and interviews straight from the TMZ newsroom. Don’t miss a beat—watch now and see what’s trending in Hollywood.