community.aws.glue_crawler (5.1.0) — module

Manage an AWS Glue crawler

Added in version 4.1.0 of community.aws

Authors: Ivan Chekaldin (@ichekaldin)

Install collection

Install with ansible-galaxy collection install community.aws:==5.1.0


Add to requirements.yml

  collections:
    - name: community.aws
      version: 5.1.0

Description

Manage an AWS Glue crawler. See U(https://aws.amazon.com/glue/) for details.

Prior to release 5.0.0 this module was called C(community.aws.aws_glue_crawler). The usage did not change.


Requirements

Usage examples

# Note: These examples do not set authentication details, see the AWS Guide for details.

# Create an AWS Glue crawler
- community.aws.glue_crawler:
    name: my-glue-crawler
    database_name: my_database
    role: my-iam-role
    schema_change_policy:
      delete_behavior: DELETE_FROM_DATABASE
      update_behavior: UPDATE_IN_DATABASE
    recrawl_policy:
      recrawl_behavior: CRAWL_EVERYTHING
    targets:
      S3Targets:
        - Path: "s3://my-bucket/prefix/folder/"
          ConnectionName: my-connection
          Exclusions:
            - "**.json"
            - "**.yml"
    state: present
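
# Add tags to the crawler without removing tags that are already set.
# A minimal sketch, not part of the original module examples: the tag keys and
# values are hypothetical. role and targets are repeated because they are
# required when state=present.
- community.aws.glue_crawler:
    name: my-glue-crawler
    database_name: my_database
    role: my-iam-role
    targets:
      S3Targets:
        - Path: "s3://my-bucket/prefix/folder/"
          ConnectionName: my-connection
          Exclusions:
            - "**.json"
            - "**.yml"
    tags:
      environment: dev
      team: data-platform
    purge_tags: false
    state: present
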
# Delete an AWS Glue crawler
- community.aws.glue_crawler:
    name: my-glue-crawler
    state: absent
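
# Create a crawler that only crawls new folders, then print some returned values.
# A minimal sketch, not part of the original module examples: the crawler name and
# the crawler_result variable are hypothetical. Both schema change behaviors are
# set to LOG here as a conservative choice; check the AWS Glue documentation for
# the combinations it accepts with CRAWL_NEW_FOLDERS_ONLY.
- community.aws.glue_crawler:
    name: my-incremental-crawler
    database_name: my_database
    role: my-iam-role
    schema_change_policy:
      delete_behavior: LOG
      update_behavior: LOG
    recrawl_policy:
      recrawl_behavior: CRAWL_NEW_FOLDERS_ONLY
    targets:
      S3Targets:
        - Path: "s3://my-bucket/prefix/folder/"
    state: present
  register: crawler_result

# The keys used below follow the Outputs section of this page.
- ansible.builtin.debug:
    msg: "Crawler {{ crawler_result.name }} uses {{ crawler_result.recrawl_policy.RecrawlBehavior }}"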

Inputs

    
name:
    description:
    - The name you assign to this crawler definition. It must be unique in your account.
    required: true
    type: str

role:
    description:
    - The name or ARN of the IAM role associated with this crawler.
    - Required when I(state=present).
    type: str

tags:
    aliases:
    - resource_tags
    description:
    - A dictionary representing the tags to be applied to the resource.
    - If the I(tags) parameter is not set then tags will not be modified.
    required: false
    type: dict

state:
    choices:
    - present
    - absent
    description:
    - Create or delete the AWS Glue crawler.
    required: true
    type: str

region:
    aliases:
    - aws_region
    - ec2_region
    description:
    - The AWS region to use.
    - For global services such as IAM, Route53 and CloudFront, I(region) is ignored.
    - The C(AWS_REGION) or C(EC2_REGION) environment variables may also be used.
    - See the Amazon AWS documentation for more information U(http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region).
    - The C(ec2_region) alias has been deprecated and will be removed in a release after
      2024-12-01.
    - Support for the C(EC2_REGION) environment variable has been deprecated and will
      be removed in a release after 2024-12-01.
    type: str

profile:
    aliases:
    - aws_profile
    description:
    - A named AWS profile to use for authentication.
    - See the AWS documentation for more information about named profiles U(https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html).
    - The C(AWS_PROFILE) environment variable may also be used.
    - The I(profile) option is mutually exclusive with the I(aws_access_key), I(aws_secret_key)
      and I(security_token) options.
    type: str

targets:
    description:
    - A list of targets to crawl. See the usage examples for the expected structure.
    - Required when I(state=present).
    type: dict

access_key:
    aliases:
    - aws_access_key_id
    - aws_access_key
    - ec2_access_key
    description:
    - AWS access key ID.
    - See the AWS documentation for more information about access tokens U(https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
    - The C(AWS_ACCESS_KEY_ID), C(AWS_ACCESS_KEY) or C(EC2_ACCESS_KEY) environment variables
      may also be used in decreasing order of preference.
    - The I(aws_access_key) and I(profile) options are mutually exclusive.
    - The I(aws_access_key_id) alias was added in release 5.1.0 for consistency with the
      AWS botocore SDK.
    - The I(ec2_access_key) alias has been deprecated and will be removed in a release
      after 2024-12-01.
    - Support for the C(EC2_ACCESS_KEY) environment variable has been deprecated and will
      be removed in a release after 2024-12-01.
    type: str

aws_config:
    description:
    - A dictionary to modify the botocore configuration.
    - Parameters can be found in the AWS documentation U(https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html#botocore.config.Config).
    type: dict

purge_tags:
    default: true
    description:
    - If I(purge_tags=true) and I(tags) is set, existing tags will be purged from the
      resource to match exactly what is defined by the I(tags) parameter.
    - If the I(tags) parameter is not set then tags will not be modified, even if I(purge_tags=true).
    - Tag keys beginning with C(aws:) are reserved by Amazon and cannot be modified. As
      such they will be ignored for the purposes of the I(purge_tags) parameter. See
      the Amazon documentation for more information U(https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html#tag-conventions).
    required: false
    type: bool

secret_key:
    aliases:
    - aws_secret_access_key
    - aws_secret_key
    - ec2_secret_key
    description:
    - AWS secret access key.
    - See the AWS documentation for more information about access tokens U(https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
    - The C(AWS_SECRET_ACCESS_KEY), C(AWS_SECRET_KEY), or C(EC2_SECRET_KEY) environment
      variables may also be used in decreasing order of preference.
    - The I(secret_key) and I(profile) options are mutually exclusive.
    - The I(aws_secret_access_key) alias was added in release 5.1.0 for consistency with
      the AWS botocore SDK.
    - The I(ec2_secret_key) alias has been deprecated and will be removed in a release
      after 2024-12-01.
    - Support for the C(EC2_SECRET_KEY) environment variable has been deprecated and will
      be removed in a release after 2024-12-01.
    type: str

description:
    description:
    - Description of the crawler being defined.
    type: str

endpoint_url:
    aliases:
    - ec2_url
    - aws_endpoint_url
    - s3_url
    description:
    - URL to connect to instead of the default AWS endpoints. While this can be used
      to connect to other AWS-compatible services, the amazon.aws and community.aws
      collections are only tested against AWS.
    - The C(AWS_URL) or C(EC2_URL) environment variables may also be used, in decreasing
      order of preference.
    - The I(ec2_url) and I(s3_url) aliases have been deprecated and will be removed in
      a release after 2024-12-01.
    - Support for the C(EC2_URL) environment variable has been deprecated and will be
      removed in a release after 2024-12-01.
    type: str

table_prefix:
    description:
    - The table prefix used for catalog tables that are created.
    type: str

aws_ca_bundle:
    description:
    - The location of a CA Bundle to use when validating SSL certificates.
    - The C(AWS_CA_BUNDLE) environment variable may also be used.
    type: path

database_name:
    description:
    - The name of the database where results are written.
    type: str

session_token:
    aliases:
    - aws_session_token
    - security_token
    - aws_security_token
    - access_token
    description:
    - AWS STS session token for use with temporary credentials.
    - See the AWS documentation for more information about access tokens U(https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
    - The C(AWS_SESSION_TOKEN), C(AWS_SECURITY_TOKEN) or C(EC2_SECURITY_TOKEN) environment
      variables may also be used in decreasing order of preference.
    - The I(security_token) and I(profile) options are mutually exclusive.
    - Aliases I(aws_session_token) and I(session_token) were added in release 3.2.0, with
      the parameter being renamed from I(security_token) to I(session_token) in release
      6.0.0.
    - The I(security_token), I(aws_security_token), and I(access_token) aliases have been
      deprecated and will be removed in a release after 2024-12-01.
    - Support for the C(EC2_SECURITY_TOKEN) and C(AWS_SECURITY_TOKEN) environment variables
      has been deprecated and will be removed in a release after 2024-12-01.
    type: str

recrawl_policy:
    description:
    - A policy that specifies whether to crawl the entire dataset again, or to crawl only
      folders that were added since the last crawler run.
    suboptions:
      recrawl_behavior:
        description:
        - Specifies whether to crawl the entire dataset again or to crawl only folders
          that were added since the last crawler run.
        - Supported options are C(CRAWL_EVERYTHING) and C(CRAWL_NEW_FOLDERS_ONLY).
        type: str
    type: dict

validate_certs:
    default: true
    description:
    - When set to C(false), SSL certificates will not be validated for communication with
      the AWS APIs.
    - Setting I(validate_certs=false) is strongly discouraged. As an alternative, consider
      setting I(aws_ca_bundle) instead.
    type: bool

schema_change_policy:
    description:
    - The policy for the crawler's update and deletion behavior.
    suboptions:
      delete_behavior:
        description:
        - Defines the deletion behavior when the crawler finds a deleted object.
        - Supported options are C(LOG), C(DELETE_FROM_DATABASE), and C(DEPRECATE_IN_DATABASE).
        type: str
      update_behavior:
        description:
        - Defines the update behavior when the crawler finds a changed schema.
        - Supported options are C(LOG) and C(UPDATE_IN_DATABASE).
        type: str
    type: dict

debug_botocore_endpoint_logs:
    default: false
    description:
    - Use a C(botocore.endpoint) logger to parse the unique (rather than total) C("resource:action")
      API calls made during a task, outputting the set to the resource_actions key in the
      task results. Use the C(aws_resource_action) callback to output the total list made
      during a playbook.
    - The C(ANSIBLE_DEBUG_BOTOCORE_LOGS) environment variable may also be used.
    type: bool
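
A minimal sketch of passing connection options explicitly, assuming a named profile called my-profile exists in the local AWS configuration (the profile name and the region are hypothetical). The access_key and secret_key options are left out because they are mutually exclusive with profile:

  - community.aws.glue_crawler:
      name: my-glue-crawler
      state: absent
      region: us-east-1
      profile: my-profile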

Outputs

creation_time:
  description: The time and date that this crawler definition was created.
  returned: when state is present
  sample: '2021-04-01T05:19:58.326000+00:00'
  type: str
database_name:
  description: The name of the database where results are written.
  returned: when state is present
  sample: my_database
  type: str
description:
  description: Description of the crawler.
  returned: when state is present
  sample: My crawler
  type: str
last_updated:
  description: The time and date that this crawler definition was last updated.
  returned: when state is present
  sample: '2021-04-01T05:19:58.326000+00:00'
  type: str
name:
  description: The name of the AWS Glue crawler.
  returned: always
  sample: my-glue-crawler
  type: str
recrawl_policy:
  contains:
    RecrawlBehavior:
      description: Whether to crawl the entire dataset again or to crawl only folders
        that were added since the last crawler run.
      returned: when state is present
      sample: CRAWL_EVERYTHING
      type: str
  description: A policy that specifies whether to crawl the entire dataset again,
    or to crawl only folders that were added since the last crawler run.
  returned: when state is present
  type: complex
role:
  description: The name or ARN of the IAM role associated with this crawler.
  returned: when state is present
  sample: my-iam-role
  type: str
schema_change_policy:
  contains:
    DeleteBehavior:
      description: The deletion behavior when the crawler finds a deleted object.
      returned: when state is present
      sample: DELETE_FROM_DATABASE
      type: str
    UpdateBehavior:
      description: The update behavior when the crawler finds a changed schema.
      returned: when state is present
      sample: UPDATE_IN_DATABASE
      type: str
  description: The policy for the crawler's update and deletion behavior.
  returned: when state is present
  type: complex
table_prefix:
  description: The table prefix used for catalog tables that are created.
  returned: when state is present
  sample: my_prefix
  type: str
targets:
  contains:
    CatalogTargets:
      description: List of catalog targets.
      returned: when state is present
      type: list
    DynamoDBTargets:
      description: List of DynamoDB targets.
      returned: when state is present
      type: list
    JdbcTargets:
      description: List of JDBC targets.
      returned: when state is present
      type: list
    MongoDBTargets:
      description: List of MongoDB targets.
      returned: when state is present
      type: list
    S3Targets:
      description: List of S3 targets.
      returned: when state is present
      type: list
  description: A list of targets to crawl.
  returned: when state is present
  type: complex