community.aws.aws_glue_job (3.4.0) — module

Manage an AWS Glue job

Added in version 1.0.0 of community.aws

Authors: Rob White (@wimnat), Vijayanand Sharma (@vijayanandsharma)

Install collection

Install with ansible-galaxy collection install community.aws:==3.4.0


Add to requirements.yml

  collections:
    - name: community.aws
      version: 3.4.0

Description

Manage an AWS Glue job. See U(https://aws.amazon.com/glue/) for details.


Requirements

Usage examples

# Note: These examples do not set authentication details. See the AWS Guide for details.

# Create an AWS Glue job
- community.aws.aws_glue_job:
    command_script_location: "s3://s3bucket/script.py"
    default_arguments:
      "--extra-py-files": s3://s3bucket/script-package.zip
      "--TempDir": "s3://s3bucket/temp/"
    name: my-glue-job
    role: my-iam-role
    state: present
# Delete an AWS Glue job
- community.aws.aws_glue_job:
    name: my-glue-job
    state: absent
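
The options for worker sizing, retries and timeouts listed under Inputs can be combined in a single task. The following is a minimal sketch rather than an official example: the job name, role, S3 path and all numeric values are placeholders chosen for illustration, and it assumes that worker_type and number_of_workers are used instead of allocated_capacity.

```yaml
# Sketch only: my-glue-job, my-iam-role and the S3 path are placeholders.
- name: Create a Glue job with explicit worker sizing
  community.aws.aws_glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    glue_version: "2.0"      # quoted because the option is a string
    worker_type: G.1X
    number_of_workers: 5
    timeout: 60              # minutes
    max_retries: 1
    max_concurrent_runs: 2
    state: present
```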

Inputs

    
name:
    description:
    - The name you assign to this job definition. It must be unique in your account.
    required: true
    type: str

role:
    description:
    - The name or ARN of the IAM role associated with this job.
    - Required when I(state=present).
    type: str

tags:
    description:
    - A hash/dictionary of tags to be applied to the job.
    - Remove completely or specify an empty dictionary to remove all tags.
    type: dict
    version_added: 2.2.0
    version_added_collection: community.aws

state:
    choices:
    - present
    - absent
    description:
    - Create or delete the AWS Glue job.
    required: true
    type: str

region:
    aliases:
    - aws_region
    - ec2_region
    description:
    - The AWS region to use. If not specified, the value of the AWS_REGION or EC2_REGION
      environment variable, if any, is used. See U(http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region).
    type: str

ec2_url:
    aliases:
    - aws_endpoint_url
    - endpoint_url
    description:
    - URL to use to connect to EC2 or your Eucalyptus cloud (by default the module will
      use EC2 endpoints). Ignored for modules where region is required. Must be specified
      for all other modules if region is not used. If not set then the value of the EC2_URL
      environment variable, if any, is used.
    type: str

profile:
    aliases:
    - aws_profile
    description:
    - Using I(profile) will override I(aws_access_key), I(aws_secret_key) and I(security_token)
      and support for passing them at the same time as I(profile) has been deprecated.
    - I(aws_access_key), I(aws_secret_key) and I(security_token) will be made mutually
      exclusive with I(profile) after 2022-06-01.
    type: str

timeout:
    description:
    - The job timeout in minutes.
    type: int

aws_config:
    description:
    - A dictionary to modify the botocore configuration.
    - Parameters can be found at U(https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html#botocore.config.Config).
    - Only the 'user_agent' key is used for boto modules. See U(http://boto.cloudhackers.com/en/latest/boto_config_tut.html#boto)
      for more boto configuration.
    type: dict

purge_tags:
    default: true
    description:
    - If C(true), existing tags will be purged from the resource to match exactly what
      is defined by I(tags) parameter.
    - If the I(tags) parameter is not set then tags will not be modified.
    type: bool
    version_added: 2.2.0
    version_added_collection: community.aws

connections:
    description:
    - A list of Glue connections used for this job.
    elements: str
    type: list

description:
    description:
    - Description of the job being defined.
    type: str

max_retries:
    description:
    - The maximum number of times to retry this job if it fails.
    type: int

worker_type:
    choices:
    - Standard
    - G.1X
    - G.2X
    description:
    - The type of predefined worker that is allocated when a job runs.
    type: str
    version_added: 1.5.0
    version_added_collection: community.aws

command_name:
    default: glueetl
    description:
    - The name of the job command. This must be 'glueetl'.
    type: str

glue_version:
    description:
    - Glue version determines the versions of Apache Spark and Python that AWS Glue supports.
    type: str
    version_added: 1.5.0
    version_added_collection: community.aws

aws_ca_bundle:
    description:
    - The location of a CA Bundle to use when validating SSL certificates.
    - Not used by boto 2 based modules.
    - 'Note: The CA Bundle is read ''module'' side and may need to be explicitly copied
      from the controller if not run locally.'
    type: path

aws_access_key:
    aliases:
    - ec2_access_key
    - access_key
    description:
    - C(AWS access key). If not set then the value of the C(AWS_ACCESS_KEY_ID), C(AWS_ACCESS_KEY)
      or C(EC2_ACCESS_KEY) environment variable is used.
    - If I(profile) is set this parameter is ignored.
    - Passing the I(aws_access_key) and I(profile) options at the same time has been deprecated
      and the options will be made mutually exclusive after 2022-06-01.
    type: str

aws_secret_key:
    aliases:
    - ec2_secret_key
    - secret_key
    description:
    - C(AWS secret key). If not set then the value of the C(AWS_SECRET_ACCESS_KEY), C(AWS_SECRET_KEY),
      or C(EC2_SECRET_KEY) environment variable is used.
    - If I(profile) is set this parameter is ignored.
    - Passing the I(aws_secret_key) and I(profile) options at the same time has been deprecated
      and the options will be made mutually exclusive after 2022-06-01.
    type: str

security_token:
    aliases:
    - aws_security_token
    - access_token
    description:
    - C(AWS STS security token). If not set then the value of the C(AWS_SECURITY_TOKEN)
      or C(EC2_SECURITY_TOKEN) environment variable is used.
    - If I(profile) is set this parameter is ignored.
    - Passing the I(security_token) and I(profile) options at the same time has been deprecated
      and the options will be made mutually exclusive after 2022-06-01.
    type: str

validate_certs:
    default: true
    description:
    - When set to "no", SSL certificates will not be validated for communication with
      the AWS APIs.
    type: bool

default_arguments:
    description:
    - A dict of default arguments for this job. You can specify arguments here that your
      own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
    type: dict

number_of_workers:
    description:
    - The number of workers of a defined workerType that are allocated when a job runs.
    type: int
    version_added: 1.5.0
    version_added_collection: community.aws

allocated_capacity:
    description:
    - The number of AWS Glue data processing units (DPUs) to allocate to this Job. From
      2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of
      processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory.
    type: int

max_concurrent_runs:
    description:
    - The maximum number of concurrent runs allowed for the job. The default is 1. An
      error is returned when this threshold is reached. The maximum value you can specify
      is controlled by a service limit.
    type: int

command_python_version:
    description:
    - Python version being used to execute a Python shell job.
    - AWS currently supports C('2') or C('3').
    type: str
    version_added: 2.2.0
    version_added_collection: community.aws

command_script_location:
    description:
    - The S3 path to a script that executes a job.
    - Required when I(state=present).
    type: str

debug_botocore_endpoint_logs:
    default: 'no'
    description:
    - Use a botocore.endpoint logger to parse the unique (rather than total) "resource:action"
      API calls made during a task, outputting the set to the resource_actions key in the
      task results. Use the aws_resource_action callback to output the total list made
      during a playbook. The ANSIBLE_DEBUG_BOTOCORE_LOGS environment variable may also
      be used.
    type: bool
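
As an illustration of how tags, purge_tags and the connection options interact, the sketch below adds one tag while leaving any existing tags on the job in place. The tag key and value, the region and the profile name are hypothetical; substitute values that exist in your own account.

```yaml
# Sketch only: tag, region and profile values are placeholders.
- name: Tag an existing Glue job without purging its other tags
  community.aws.aws_glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    tags:
      environment: staging
    purge_tags: false        # the default (true) would remove tags not listed above
    region: us-east-1
    profile: my-aws-profile  # used instead of aws_access_key / aws_secret_key
    state: present
```

With purge_tags left at its default of true, the same task would instead reduce the job's tags to exactly the set given in tags.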

Outputs

allocated_capacity:
  description: The number of AWS Glue data processing units (DPUs) allocated to runs
    of this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is
    a relative measure of processing power that consists of 4 vCPUs of compute capacity
    and 16 GB of memory.
  returned: when state is present
  sample: 10
  type: int
command:
  contains:
    name:
      description: The name of the job command.
      returned: when state is present
      sample: glueetl
      type: str
    python_version:
      description: Specifies the Python version.
      returned: when state is present
      sample: 3
      type: str
    script_location:
      description: Specifies the S3 path to a script that executes a job.
      returned: when state is present
      sample: mybucket/myscript.py
      type: str
  description: The JobCommand that executes this job.
  returned: when state is present
  type: complex
connections:
  description: The connections used for this job.
  returned: when state is present
  sample: '{ Connections: [ ''list'', ''of'', ''connections'' ] }'
  type: dict
created_on:
  description: The time and date that this job definition was created.
  returned: when state is present
  sample: '2018-04-21T05:19:58.326000+00:00'
  type: str
default_arguments:
  description: The default arguments for this job, specified as name-value pairs.
  returned: when state is present
  sample: '{ ''mykey1'': ''myvalue1'' }'
  type: dict
description:
  description: Description of the job being defined.
  returned: when state is present
  sample: My first Glue job
  type: str
execution_property:
  contains:
    max_concurrent_runs:
      description: The maximum number of concurrent runs allowed for the job. The
        default is 1. An error is returned when this threshold is reached. The maximum
        value you can specify is controlled by a service limit.
      returned: when state is present
      sample: 1
      type: int
  description: An ExecutionProperty specifying the maximum number of concurrent runs
    allowed for this job.
  returned: always
  type: complex
glue_version:
  description: Glue version.
  returned: when state is present
  sample: 2.0
  type: str
job_name:
  description: The name of the AWS Glue job.
  returned: always
  sample: my-glue-job
  type: str
last_modified_on:
  description: The last point in time when this job definition was modified.
  returned: when state is present
  sample: '2018-04-21T05:19:58.326000+00:00'
  type: str
max_retries:
  description: The maximum number of times to retry this job after a JobRun fails.
  returned: when state is present
  sample: 5
  type: int
name:
  description: The name assigned to this job definition.
  returned: when state is present
  sample: my-glue-job
  type: str
role:
  description: The name or ARN of the IAM role associated with this job.
  returned: when state is present
  sample: my-iam-role
  type: str
timeout:
  description: The job timeout in minutes.
  returned: when state is present
  sample: 300
  type: int
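
The return values above can be captured with register and used in later tasks. This is a minimal sketch, assuming the values are exposed at the top level of the registered result, as is usual for Ansible modules. The variable name glue_job_result is arbitrary, and the default filter guards against a key that is not returned.

```yaml
# Sketch only: glue_job_result is an arbitrary variable name.
- name: Create or update the job and capture the result
  community.aws.aws_glue_job:
    name: my-glue-job
    role: my-iam-role
    command_script_location: "s3://s3bucket/script.py"
    state: present
  register: glue_job_result

- name: Show selected return values
  ansible.builtin.debug:
    msg: >-
      Job {{ glue_job_result.name }} runs
      {{ glue_job_result.command.script_location }}
      with a timeout of {{ glue_job_result.timeout | default('unset') }} minutes
```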