records_mover.records.targets package
Module contents
- class records_mover.records.targets.RecordsTargets(url_resolver, db_driver)
Bases: object
These methods produce objects representing the target of a records move. The objects can be used as the ‘target’ argument to records_mover.records.move().
This object should be pulled from the ‘targets’ property of the ‘records’ property on a records_mover.Session object instead of being constructed directly.
Example use:
    records = session.records
    db_engine = session.get_default_db_engine()
    url = 's3://some-bucket/some-directory/'
    source = records.sources.directory_from_url(url=url)
    target = records.targets.table(schema_name='myschema',
                                   table_name='mytable',
                                   db_engine=db_engine)
    results = records.move(source, target)
- Parameters
url_resolver (UrlResolver) –
db_driver (Callable[[Optional[Union[Engine, Connection]], Optional[Connection], Optional[Engine]], DBDriver]) –
- directory_from_url(output_url, records_format=None)
Represents a Records Directory pointed to by a URL as a target.
- Parameters
output_url (str) – Location to write the records directory. Must be a URL format understood by the records_mover.url library, and must be a directory URL that ends with a ‘/’.
records_format (Optional[BaseRecordsFormat]) – Description of the format of the data files to write out. If not specified, an efficient format for bulk moves will be chosen.
- Return type
DirectoryFromUrlRecordsTarget
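Because a directory URL must end with a ‘/’, it can help to normalize the URL before building the target. The following is a minimal sketch; `as_directory_url` is a hypothetical helper written for illustration and is not part of records_mover.

```python
def as_directory_url(url: str) -> str:
    """Return url unchanged if it ends with '/', otherwise append one.

    Hypothetical helper for illustration; not part of records_mover.
    """
    if not url:
        raise ValueError("output_url must not be empty")
    return url if url.endswith("/") else url + "/"


# The bucket and directory names are placeholders.
output_url = as_directory_url("s3://some-bucket/some-directory")
```

The result could then be passed as records.targets.directory_from_url(output_url=output_url).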
- table(db_engine, schema_name, table_name, existing_table_handling=ExistingTableHandling.DELETE_AND_OVERWRITE, drop_and_recreate_on_load_error=False, add_user_perms_for=None, add_group_perms_for=None, db_conn=None)
Represents a SQLAlchemy-accessible database table as a target.
- Parameters
db_engine (Engine) – SQLAlchemy database engine to write data to.
schema_name (str) – Schema name of a table to write data to.
table_name (str) – Table name of a table to write data to.
existing_table_handling (ExistingTableHandling) – When loading into a database table, controls how any existing table found will be handled. This must be a records_mover.records.ExistingTableHandling object.
drop_and_recreate_on_load_error (bool) – If True, the mover attempts to address table load errors by dropping the target table and reloading the incoming data.
add_user_perms_for (Optional[Dict[str, List[str]]]) – If specified, a table’s permissions will be set for the specified users. Format should be like {'all': ['username1', 'username2'], 'select': ['username3', 'username4']}
add_group_perms_for (Optional[Dict[str, List[str]]]) – If specified, a table’s permissions will be set for the specified groups. Format should be like {'all': ['group1', 'group2'], 'select': ['group3', 'group4']}
db_conn (Optional[Connection]) – SQLAlchemy database connection to write data to. If not specified, one will be created from the db_engine.
- Return type
TableRecordsTarget
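The permission arguments are dictionaries mapping a permission name to a list of names, and passing a bare string where a list is expected is an easy mistake to make. The sketch below checks that shape before the dictionaries are used; `check_perms` is a hypothetical helper for illustration, not part of records_mover, and the user and group names are the placeholders from the parameter descriptions above.

```python
from typing import Dict, List


def check_perms(perms: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Verify that perms maps permission names to lists of name strings.

    Hypothetical helper for illustration; not part of records_mover.
    """
    for permission, names in perms.items():
        if not isinstance(permission, str):
            raise TypeError(f"permission key must be str, got {permission!r}")
        if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
            raise TypeError(f"{permission!r} must map to a list of str")
    return perms


user_perms = check_perms({"all": ["username1", "username2"],
                          "select": ["username3", "username4"]})
group_perms = check_perms({"all": ["group1", "group2"],
                           "select": ["group3", "group4"]})
```

These values could then be supplied as add_user_perms_for=user_perms and add_group_perms_for=group_perms in a call to records.targets.table().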
- google_sheet(spreadsheet_id, sheet_name, google_cloud_creds)
Represents a sheet in a Google Sheets spreadsheet as a target, via the Google Sheets API.
- Parameters
spreadsheet_id (str) – This is the xyz in https://docs.google.com/spreadsheets/d/xyz/edit?ts=5be5b383#gid=abc
sheet_name (str) – This is the label of the particular tab within the Google Sheets spreadsheet where the data should go.
google_cloud_creds (google.auth.credentials.Credentials) – Credentials object for Google Cloud Platform access.
- Return type
GoogleSheetsRecordsTarget
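The spreadsheet_id is the path segment that follows /spreadsheets/d/ in the sheet's URL. As a rough sketch, it can be pulled out of a pasted URL with the standard library; `spreadsheet_id_from_url` is a hypothetical helper for illustration and is not part of records_mover or the Google client libraries.

```python
import re
from urllib.parse import urlparse


def spreadsheet_id_from_url(url: str) -> str:
    """Extract the ID segment that follows /spreadsheets/d/ in the URL path.

    Hypothetical helper for illustration; not part of records_mover.
    """
    path = urlparse(url).path  # drops the query string and the #gid fragment
    match = re.search(r"/spreadsheets/d/([^/]+)", path)
    if match is None:
        raise ValueError(f"not a Google Sheets URL: {url!r}")
    return match.group(1)


spreadsheet_id = spreadsheet_id_from_url(
    "https://docs.google.com/spreadsheets/d/xyz/edit?ts=5be5b383#gid=abc"
)
```

With the example URL from the parameter description, spreadsheet_id is 'xyz'.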
- fileobj(output_fileobj, records_format)
Represents a stream of data file bytes as a target.
- Parameters
output_fileobj (IO[bytes]) – Stream where the file should be written.
records_format (BaseRecordsFormat) – Description of the format of the data file to write out. The signature gives this parameter no default, so it must be supplied.
- Return type
FileobjTarget
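Since output_fileobj is typed as IO[bytes], the stream needs to be opened in binary mode. The following minimal sketch rejects text streams up front; `require_binary_stream` is a hypothetical helper for illustration and is not part of records_mover.

```python
import io
from typing import IO


def require_binary_stream(fileobj) -> IO[bytes]:
    """Return fileobj unchanged unless it is a text-mode stream.

    Hypothetical helper for illustration; not part of records_mover.
    """
    if isinstance(fileobj, io.TextIOBase):
        raise TypeError(
            "output_fileobj must be a binary stream, "
            "e.g. io.BytesIO() or open(path, 'wb')"
        )
    return fileobj


# An in-memory binary buffer satisfies IO[bytes].
output_fileobj = require_binary_stream(io.BytesIO())
```

The checked stream could then be passed, together with a BaseRecordsFormat instance, as the output_fileobj argument of records.targets.fileobj().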
- data_url(output_url, records_format=None)
Represents a URL pointer to a data file as a target.
- Parameters
output_url (str) – Location of the data file to write. Must be a URL format understood by the records_mover.url library corresponding to a file, not a directory (i.e., not ending with a ‘/’).
records_format (Optional[BaseRecordsFormat]) – Description of the format of the data files to write out. If not specified, an efficient format for bulk moves will be chosen.
- Return type
DataUrlTarget
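The trailing ‘/’ is what distinguishes a directory URL (for directory_from_url) from a file URL (for data_url). One way to choose between the two is sketched below; `target_for_url` is a hypothetical helper for illustration, and its `targets` argument is assumed to be the object obtained from session.records.targets.

```python
def target_for_url(targets, output_url, records_format=None):
    """Pick the URL-based target factory from the shape of output_url.

    Hypothetical helper for illustration; not part of records_mover.
    `targets` is assumed to be the object from session.records.targets.
    """
    if output_url.endswith("/"):
        # A trailing '/' marks a records directory.
        return targets.directory_from_url(output_url=output_url,
                                          records_format=records_format)
    # Anything else is treated as a single data file.
    return targets.data_url(output_url=output_url,
                            records_format=records_format)
```

Only the two method names and their documented parameters are used, so the helper adds no behavior of its own beyond the dispatch.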
- local_file(filename, records_format=None)
Represents a data file on the local filesystem as a target.
- Parameters
filename (str) – File path (relative or absolute) of the data file to unload to.
records_format (Optional[BaseRecordsFormat]) – Description of the format of the data files to write out. If not specified, an efficient format for bulk moves will be chosen.
- Return type
DataUrlTarget
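Both local_file and data_url list DataUrlTarget as their return type, which suggests that a local path is handled as a file URL. How records_mover performs that conversion is not stated here; the sketch below only shows, as an assumption, how a relative or absolute path maps to a file:// URL using the standard library. `local_path_to_url` is a hypothetical helper and the filename is a placeholder.

```python
from pathlib import Path


def local_path_to_url(filename: str) -> str:
    """Convert a relative or absolute local path to a file:// URL.

    Hypothetical helper for illustration; not part of records_mover.
    """
    # resolve() makes a relative path absolute, which as_uri() requires.
    return Path(filename).resolve().as_uri()


url = local_path_to_url("output.csv")
```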
- spectrum(schema_name, table_name, db_engine, spectrum_base_url=None, spectrum_rdir_url=None, existing_table_handling=ExistingTableHandling.TRUNCATE_AND_OVERWRITE)
Represents a location in Amazon Redshift Spectrum as a target.
- Parameters
schema_name (str) – Schema name of a table to write data to.
table_name (str) – Table name of a table to write data to.
db_engine (Engine) – SQLAlchemy database engine to write data to.
spectrum_base_url (Optional[str]) – Root S3 URL under which a simple directory structure will be created for files to be stored, if spectrum_rdir_url is not specified. Note that when using the mover CLI, db-facts may be used to provide a default.
spectrum_rdir_url (Optional[str]) – S3 URL where a records directory with files will be stored. If this is not specified, spectrum_base_url must be, unless a db-facts default exists.
existing_table_handling (ExistingTableHandling) – When loading into a database table, controls how any existing table found will be handled. This must be a records_mover.records.ExistingTableHandling object.
- Return type
SpectrumRecordsTarget
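The relationship between the two URL parameters can be summarized as: spectrum_rdir_url is used when given, otherwise spectrum_base_url is needed. The sketch below encodes that rule as a pre-check; `choose_spectrum_location` is a hypothetical helper for illustration, not part of records_mover, and it deliberately ignores any db-facts defaults.

```python
from typing import Optional, Tuple


def choose_spectrum_location(
    spectrum_base_url: Optional[str] = None,
    spectrum_rdir_url: Optional[str] = None,
) -> Tuple[str, str]:
    """Report which URL parameter would govern where files are stored.

    Hypothetical helper for illustration; not part of records_mover.
    Returns ('rdir', url) or ('base', url).
    """
    if spectrum_rdir_url is not None:
        # An explicit records directory URL takes precedence.
        return ("rdir", spectrum_rdir_url)
    if spectrum_base_url is not None:
        # A directory structure would be created under this root.
        return ("base", spectrum_base_url)
    raise ValueError("specify spectrum_rdir_url or spectrum_base_url")
```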