Releases: googleapis/python-bigquery-pandas
Version 0.13.0
- Raise NotImplementedError when the deprecated private_key argument is used. (#301)
Version 0.12.0
New features
- Add max_results argument to pandas_gbq.read_gbq(). Use this argument to limit the number of rows in the results DataFrame. Set max_results to 0 to ignore query outputs, such as for DML or DDL queries. (#102)
- Add progress_bar_type argument to pandas_gbq.read_gbq(). Use this argument to display a progress bar when downloading data. (#182)
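The two new arguments can be sketched as below. This is a hedged illustration, not part of the release notes: the project id "my-project", the dataset/table names, and the public-data query are placeholders.

```python
# Sketch of the max_results and progress_bar_type arguments added in
# 0.12.0. "my-project" and the table names are hypothetical.

def fetch_preview(project_id="my-project"):
    import pandas_gbq

    # max_results caps the rows in the returned DataFrame;
    # progress_bar_type="tqdm" shows a progress bar while downloading.
    return pandas_gbq.read_gbq(
        "SELECT name, number FROM `bigquery-public-data.usa_names.usa_1910_2013`",
        project_id=project_id,
        max_results=10,
        progress_bar_type="tqdm",
    )

def run_ddl(project_id="my-project"):
    import pandas_gbq

    # max_results=0 skips downloading rows entirely, which suits
    # DDL/DML statements whose results are not interesting.
    pandas_gbq.read_gbq(
        "CREATE TABLE IF NOT EXISTS my_dataset.events (id INT64)",
        project_id=project_id,
        max_results=0,
    )
```

Both calls require valid Google Cloud credentials and a billable project, so the functions above are defined but not invoked here.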
Dependency updates
- Update the minimum version of google-cloud-bigquery to 1.11.1. (#296)
Documentation
- Add code samples to introduction and refactor how-to guides. (#239)
Bug fixes
- Fix resource leak with use_bqstorage_api by closing the BigQuery Storage API client after use. (#294)
Version 0.11.0
- Breaking Change: Python 2 support has been dropped, to align with the pandas package, which dropped Python 2 support at the end of 2019. (#268)
Enhancements
- Ensure the table_schema argument is not modified in place. (#278)
Implementation changes
- Use object dtype for STRING, ARRAY, and STRUCT columns when there are zero rows. (#285)
Version 0.10.0
Documentation
- Document BigQuery data type to pandas dtype conversion for read_gbq. (#269)
Dependency updates
- Update the minimum version of google-cloud-bigquery to 1.9.0. (#247)
- Update the minimum version of pandas to 0.19.0. (#262)
Internal changes
- Update the authentication credentials. Note: You may need to set reauth=True in order to update your credentials to the most recent version. This is required to use new functionality such as the BigQuery Storage API. (#267)
- Use to_dataframe() from google-cloud-bigquery in the read_gbq() function. (#247)
Enhancements
- Fix a bug where pandas-gbq could not upload an empty DataFrame. (#237)
- Allow table_schema in to_gbq to contain only a subset of columns, with the rest being populated using the DataFrame dtypes. (#218) (contributed by @JohnPaton)
- Read project_id in to_gbq from provided credentials if available. (contributed by @daureg)
- read_gbq uses the timezone-aware DatetimeTZDtype(unit='ns', tz='UTC') dtype for BigQuery TIMESTAMP columns. (#269)
- Add use_bqstorage_api option to read_gbq. The BigQuery Storage API can be used to download large query results (>125 MB) more quickly. If the BigQuery Storage API can't be used, the BigQuery API is used instead. (#133, #270)
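Two of these enhancements can be sketched as follows. The dataset/table names and "my-project" are hypothetical, and the functions are left uninvoked because they need real credentials.

```python
# Sketch of the partial table_schema and use_bqstorage_api features
# from 0.10.0; names below are placeholders, not from the release.

def upload_partial_schema(df, project_id="my-project"):
    import pandas_gbq

    # Only "created" gets an explicit BigQuery type here; the remaining
    # columns fall back to types inferred from the DataFrame dtypes.
    pandas_gbq.to_gbq(
        df,
        "my_dataset.events",
        project_id=project_id,
        table_schema=[{"name": "created", "type": "TIMESTAMP"}],
        if_exists="replace",
    )

def download_large_result(project_id="my-project"):
    import pandas_gbq

    # Downloads via the BigQuery Storage API where possible; pandas-gbq
    # falls back to the plain BigQuery API if it can't be used.
    return pandas_gbq.read_gbq(
        "SELECT * FROM `my_dataset.events`",
        project_id=project_id,
        use_bqstorage_api=True,
    )
```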
Version 0.9.0
Version 0.8.0
Breaking changes
- Deprecate the private_key parameter to pandas_gbq.read_gbq and pandas_gbq.to_gbq in favor of the new credentials argument. Instead, create a credentials object using google.oauth2.service_account.Credentials.from_service_account_info or google.oauth2.service_account.Credentials.from_service_account_file. See the authentication how-to guide for examples. (#161, #231)
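The replacement for private_key can be sketched like this; the key-file path is a placeholder and the function is not invoked because it needs a real service-account key.

```python
# Sketch of the credentials argument that replaces private_key.
# "service-account-key.json" is a hypothetical path.

def read_with_service_account(query, key_path="service-account-key.json"):
    import pandas_gbq
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(key_path)
    return pandas_gbq.read_gbq(
        query,
        project_id=credentials.project_id,
        credentials=credentials,
    )
```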
Enhancements
- Allow newlines in data passed to to_gbq. (#180)
- Add pandas_gbq.context.dialect to allow overriding the default SQL syntax dialect. (#195, #235)
- Support Python 3.7. (#197, #232)
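The dialect override is a one-line setting; a minimal sketch (left uninvoked, since it requires pandas-gbq to be installed):

```python
def prefer_legacy_sql():
    import pandas_gbq

    # Overrides the default dialect for subsequent read_gbq/to_gbq
    # calls in this process.
    pandas_gbq.context.dialect = "legacy"
```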
Version 0.7.0
- int columns which contain NULL are now cast to float, rather than object type. (#174)
- DATE, DATETIME and TIMESTAMP columns are now parsed as pandas' timestamp objects. (#224)
- Add pandas_gbq.Context to cache credentials in-memory, across calls to read_gbq and to_gbq. (#198, #208)
- Fast queries now do not log above DEBUG level. (#204) With BigQuery's release of clustering, querying smaller samples of data is now faster and cheaper.
- Don't load credentials from disk if reauth is True. (#212) This fixes a bug where pandas-gbq could not refresh credentials if the cached credentials were invalid, revoked, or expired, even when reauth=True.
- Catch RefreshError when trying credentials. (#226)
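Using the new Context to cache credentials and a default project can be sketched as below; the key path and project id are placeholders, and the function is defined but not called because it needs a real key file.

```python
# Sketch of credential caching via pandas_gbq.Context (0.7.0).
# "key.json" and "my-project" are hypothetical.

def configure_once():
    import pandas_gbq
    from google.oauth2 import service_account

    pandas_gbq.context.credentials = (
        service_account.Credentials.from_service_account_file("key.json")
    )
    pandas_gbq.context.project = "my-project"
    # Later read_gbq/to_gbq calls in this process reuse these values
    # instead of re-authenticating each time.
```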
Version 0.6.1
- Improved read_gbq performance and memory consumption by delegating DataFrame construction to the pandas library, radically reducing the number of loops that execute in Python. (#128)
- Reduced verbosity of logging from read_gbq, particularly for short queries. (#201)
- Avoid a SELECT 1 query when running to_gbq. (#202)
Version 0.6.0
Version 0.5.0
- Project ID parameter is optional in read_gbq and to_gbq when it can be inferred from the environment. Note: you must still pass in a project ID when using user-based authentication. (#103)
- Progress bar added for to_gbq, using the optional tqdm library as a dependency. (#162)
- Add location parameter to read_gbq and to_gbq so that pandas-gbq can work with datasets in the Tokyo region. (#177)
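The location parameter can be sketched as below; the Tokyo region id "asia-northeast1" is the standard BigQuery region string, while the project and table names are placeholders. The function is left uninvoked since it needs real credentials and data.

```python
# Sketch of the location parameter added in 0.5.0 for non-US/EU
# datasets; names are hypothetical.

def read_tokyo_dataset(project_id="my-project"):
    import pandas_gbq

    return pandas_gbq.read_gbq(
        "SELECT * FROM my_tokyo_dataset.my_table",
        project_id=project_id,
        location="asia-northeast1",
    )
```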