pyspark.pandas.read_sql_query

pyspark.pandas.read_sql_query(sql, con, index_col=None, **options)

Read SQL query into a DataFrame.

Returns a DataFrame corresponding to the result set of the query string. Optionally provide an index_col parameter to use one of the columns as the index; otherwise, the default index will be used.

Note

Some databases might hit the Spark issue described in SPARK-27596.

Parameters
sql : string

SQL query to be executed.

con : str

A JDBC URI, provided as a string.

Note

The URI must be a JDBC URI, not a Python database URI.

index_col : string or list of strings, optional, default: None

Column(s) to set as index (MultiIndex).

options : dict

All other options are passed directly to Spark's JDBC data source.
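
For illustration only, the sketch below shows extra keyword options being forwarded to Spark's JDBC reader. The server address, credentials, and column names are assumptions, not part of this page; user, password, and fetchsize are standard Spark JDBC data source options.

>>> import pyspark.pandas as ps
>>> # Hypothetical connection details; extra keywords are handed to the JDBC source as options.
>>> ps.read_sql_query(
...     'SELECT id, name FROM table_name WHERE id > 10',
...     'jdbc:postgresql://localhost:5432/db_name',
...     user='db_user',          # assumed credentials for the JDBC connection
...     password='db_password',
...     fetchsize='1000',        # rows fetched per round trip, forwarded to the JDBC source
... )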

Returns
DataFrame

See also

read_sql_table

Read SQL database table into a DataFrame.

read_sql

Read SQL query or database table into a DataFrame.

Examples

>>> ps.read_sql_query('SELECT * FROM table_name', 'jdbc:postgresql:db_name')
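
A further hypothetical variation (database, table, and column names are placeholders) using index_col to promote a result column to the index:

>>> ps.read_sql_query(
...     'SELECT id, name FROM table_name',
...     'jdbc:postgresql:db_name',
...     index_col='id',  # 'id' becomes the DataFrame index instead of the default index
... )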