Releases: googleapis/python-bigquery-pandas

Version 0.13.0

12 Dec 22:49
2897b81
  • Raise NotImplementedError when the deprecated private_key argument is used. (#301)

Version 0.12.0

25 Nov 22:22
9fb2464

New features

  • Add max_results argument to pandas_gbq.read_gbq(). Use this
    argument to limit the number of rows in the results DataFrame. Set
    max_results to 0 to ignore query outputs, such as for DML or DDL
    queries. (#102)
  • Add progress_bar_type argument to pandas_gbq.read_gbq(). Use
    this argument to display a progress bar when downloading data.
    (#182)
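
A minimal sketch of how the two new arguments might be passed together. The helper names, the SQL, and the project ID are hypothetical, and "tqdm" as a progress bar type assumes the tqdm package is installed:

```python
def build_read_kwargs(max_results=None, progress_bar_type=None):
    """Collect the optional read_gbq() arguments, skipping any left unset."""
    kwargs = {}
    if max_results is not None:
        kwargs["max_results"] = max_results
    if progress_bar_type is not None:
        kwargs["progress_bar_type"] = progress_bar_type
    return kwargs


def run_query(sql, project_id, **kwargs):
    """Run a query with pandas-gbq (hypothetical wrapper, needs credentials)."""
    import pandas_gbq  # imported lazily so the helper above works without it

    return pandas_gbq.read_gbq(sql, project_id=project_id, **kwargs)


# Preview at most 100 rows and show a progress bar while downloading:
#   run_query("SELECT name FROM my_dataset.my_table", "my-project",
#             **build_read_kwargs(max_results=100, progress_bar_type="tqdm"))
#
# Run a DML statement and ignore its output:
#   run_query("DELETE FROM my_dataset.my_table WHERE name IS NULL",
#             "my-project", **build_read_kwargs(max_results=0))
```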

Dependency updates

  • Update the minimum version of google-cloud-bigquery to 1.11.1.
    (#296)

Documentation

  • Add code samples to introduction and refactor how-to guides. (#239)

Bug fixes

  • Fix resource leak with use_bqstorage_api by closing BigQuery Storage API client after use. (#294)

Release on PyPI

Version 0.11.0

29 Jul 20:06
9990047
  • Breaking Change: Python 2 support has been dropped. This is to align
    with the pandas package, which dropped Python 2 support at the end of 2019.
    (#268)

Enhancements

  • Ensure table_schema argument is not modified in place. (#278)

Implementation changes

  • Use object dtype for STRING, ARRAY, and STRUCT columns when
    there are zero rows. (#285)

Internal changes

  • Populate user-agent with pandas version information. (#281)
  • Fix pytest.raises usage for latest pytest. Fix warnings in tests.
    (#282)
  • Update CI to install nightly packages in the conda tests. (#254)

Version 0.10.0

05 Apr 13:37
f633fa9

Documentation

Dependency updates

  • Update the minimum version of google-cloud-bigquery to 1.9.0. (#247)
  • Update the minimum version of pandas to 0.19.0. (#262)

Internal changes

  • Update the authentication credentials. Note: You may need to set reauth=True in order to update your credentials to the most recent version. This is required to use new functionality such as the BigQuery Storage API. (#267)
  • Use to_dataframe() from google-cloud-bigquery in the read_gbq() function. (#247)

Enhancements

  • Fix a bug where pandas-gbq could not upload an empty DataFrame. (#237)
  • Allow table_schema in to_gbq to contain only a subset of columns, with the rest being populated using the DataFrame dtypes. (#218) (contributed by @JohnPaton)
  • Read project_id in to_gbq from provided credentials if available. (contributed by @daureg)
  • read_gbq uses the timezone-aware DatetimeTZDtype(unit='ns', tz='UTC') dtype for BigQuery TIMESTAMP columns. (#269)
  • Add use_bqstorage_api to read_gbq. The BigQuery Storage API can be used to download large query results (>125 MB) more quickly. If the BQ Storage API can't be used, the BigQuery API is used instead. (#133, #270)
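
As an illustration of the partial table_schema item above, the sketch below names only one column and leaves the others to be inferred from the DataFrame dtypes. The column name, helper names, and the choice of if_exists="append" are assumptions made for the example:

```python
def partial_schema():
    """Schema entries for only the columns that need an explicit type."""
    # Only "created_at" is pinned; other columns are left to dtype inference.
    return [{"name": "created_at", "type": "TIMESTAMP"}]


def upload(df, destination_table, project_id):
    """Upload a DataFrame with a partial schema (hypothetical wrapper)."""
    import pandas_gbq  # imported lazily; the call needs valid credentials

    pandas_gbq.to_gbq(
        df,
        destination_table,
        project_id=project_id,
        table_schema=partial_schema(),
        if_exists="append",
    )


# upload(df, "my_dataset.my_table", "my-project")
```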

Version 0.9.0

11 Jan 17:55
b0254c4
  • Warn when deprecated private_key parameter is used. (#240)
  • New dependency: Use the pydata-google-auth package for authentication. (#241)

PyPI

Version 0.8.0

12 Nov 09:13
398e75e

Breaking changes

  • Deprecate private_key parameter to pandas_gbq.read_gbq and pandas_gbq.to_gbq in favor of new credentials argument. Instead, create a credentials object using google.oauth2.service_account.Credentials.from_service_account_info or google.oauth2.service_account.Credentials.from_service_account_file. See the authentication how-to guide for examples. (#161, #231)
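
A rough sketch of the migration away from private_key, assuming a service account key stored as a JSON file. The wrapper function names and the key path are hypothetical:

```python
def load_credentials(key_path):
    """Build a credentials object from a service account key file."""
    from google.oauth2 import service_account  # requires google-auth

    return service_account.Credentials.from_service_account_file(key_path)


def read_with_credentials(sql, project_id, key_path):
    """Pass the credentials object instead of the deprecated private_key."""
    import pandas_gbq  # imported lazily; the call needs network access

    return pandas_gbq.read_gbq(
        sql,
        project_id=project_id,
        credentials=load_credentials(key_path),
    )


# df = read_with_credentials("SELECT 1", "my-project", "path/to/key.json")
```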

Enhancements

  • Allow newlines in data passed to to_gbq. (#180)
  • Add pandas_gbq.context.dialect to allow overriding the default SQL syntax dialect. (#195, #235)
  • Support Python 3.7. (#197, #232)

Internal changes

  • Migrate tests to CircleCI. (#228, #232)

Version 0.7.0

19 Oct 17:37
5d0346a
  • int columns which contain NULL are now cast to float, rather than object type. (#174)
  • DATE, DATETIME and TIMESTAMP columns are now parsed as pandas' timestamp objects. (#224)
  • Add pandas_gbq.Context to cache credentials in-memory, across calls to read_gbq and to_gbq. (#198, #208)
  • Fast queries now do not log above DEBUG level. (#204) With BigQuery's release of clustering, querying smaller samples of data is now faster and cheaper.
  • Don't load credentials from disk if reauth is True. (#212) This fixes a bug where pandas-gbq could not refresh credentials if the cached credentials were invalid, revoked, or expired, even when reauth=True.
  • Catch RefreshError when trying credentials. (#226)

Version 0.6.1

04 Sep 16:34
e0f0b2f
  • Improved read_gbq performance and memory consumption by delegating DataFrame construction to the pandas library, radically reducing the number of loops that execute in Python. (#128)
  • Reduced verbosity of logging from read_gbq, particularly for short queries. (#201)
  • Avoid SELECT 1 query when running to_gbq. (#202)

Version 0.6.0

21 Aug 17:39
d6b7507
  • Warn when dialect is not passed in to read_gbq. The default dialect
    will be changing from 'legacy' to 'standard' in a future version.
    (#195)
  • Use general float with 15 decimal digit precision when writing to local
    CSV buffer in to_gbq. This prevents numerical overflow in certain
    edge cases. (#192)
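
To illustrate what a general float format with 15 significant digits looks like, here is a small standard-library sketch. It assumes the format string is "%.15g", which is not stated in the note above, and format_float is a hypothetical helper:

```python
# Assumed format: general ("g") notation with 15 significant digits.
FLOAT_FORMAT = "%.15g"


def format_float(value):
    """Render a float the way the assumed CSV float format would."""
    return FLOAT_FORMAT % value


# Rounding to 15 significant digits hides binary representation noise.
print(format_float(0.1 + 0.2))  # 0.3
# Very large values switch to exponent notation instead of a long digit string.
print(format_float(1e300))      # 1e+300
```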

Version 0.5.0

25 Jun 17:11
ade32a2
  • Project ID parameter is optional in read_gbq and to_gbq when it can be inferred from the environment. Note: you must still pass in a project ID when using user-based authentication. (#103)
  • Progress bar added for to_gbq, using tqdm as an optional dependency. (#162)
  • Add location parameter to read_gbq and to_gbq so that pandas-gbq can work with datasets in the Tokyo region. (#177)
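
A short sketch of the location parameter for a dataset in the Tokyo region. The constant and function names, the SQL, and the project ID are placeholders chosen for the example:

```python
# BigQuery location name for the Tokyo region.
TOKYO_LOCATION = "asia-northeast1"


def location_kwargs(location=None):
    """Return the location keyword argument only when a location is given."""
    return {"location": location} if location else {}


def read_from_tokyo(sql, project_id):
    """Query a dataset stored in Tokyo (hypothetical wrapper)."""
    import pandas_gbq  # imported lazily; the call needs valid credentials

    return pandas_gbq.read_gbq(
        sql, project_id=project_id, **location_kwargs(TOKYO_LOCATION)
    )


# df = read_from_tokyo("SELECT * FROM my_dataset.my_table", "my-project")
```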