FormatCase class - AWS Glue

FormatCase class

The FormatCase transform changes each string in a column to the specified case type.

Example

    from pyspark.context import SparkContext
    from pyspark.sql import SparkSession
    from awsgluedi.transforms import *

    sc = SparkContext()
    spark = SparkSession(sc)

    # Read the input data; replace ${BUCKET} with your bucket name.
    datasource1 = spark.read.json("s3://${BUCKET}/json/zips/raw/data")

    try:
        # Convert every value in the "city" column to lowercase.
        df_output = data_cleaning.FormatCase.apply(
            data_frame=datasource1,
            spark_context=sc,
            source_column="city",
            case_type="LOWER",
        )
    except:
        # Report the failure, then re-raise so the job still fails.
        print("Unexpected error happened")
        raise

Output

With `case_type="LOWER"`, the FormatCase transform converts the values in the `city` column to lowercase. The resulting `df_output` DataFrame contains all columns from the original `datasource1` DataFrame, with only the `city` values changed.
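
To make the expected result concrete, the following plain-Python sketch mimics the effect on a few rows without Spark or AWS Glue. The sample rows and the helper `lowercase_column` are hypothetical and are not part of the `awsgluedi` API. Leaving non-string values untouched is an assumption, not documented behavior.

```python
# Hypothetical stand-in for the transform's effect; not the awsgluedi implementation.
def lowercase_column(rows, source_column):
    """Return new rows with source_column lowercased and all other columns unchanged."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        value = new_row.get(source_column)
        if isinstance(value, str):  # assumption: non-string values pass through as-is
            new_row[source_column] = value.lower()
        result.append(new_row)
    return result


# Made-up sample rows, loosely shaped like a zip-code dataset.
sample_rows = [
    {"city": "AGAWAM", "state": "MA", "pop": 15338},
    {"city": "Cushman", "state": "MA", "pop": 36963},
]

lowered = lowercase_column(sample_rows, "city")
for row in lowered:
    print(row)
```

Only the `city` values change here; the `state` and `pop` columns are carried over as they were, which matches the output described above.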

Methods

__call__(spark_context, data_frame, source_column, case_type)

The FormatCase transform changes each string in a column to the specified case type.

  • source_column – The name of an existing column.

  • case_type – Supported case types are CAPITAL, LOWER, UPPER, and SENTENCE.
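
As a rough illustration of how the four case types might differ, here is a minimal plain-Python sketch applied to a single string. The function `format_case` is hypothetical. The interpretations of CAPITAL (capitalize each word) and SENTENCE (capitalize only the first character) are assumptions, since this page does not define them, and the actual transform may behave differently.

```python
# Hypothetical illustration of the four case types; the real transform may differ.
def format_case(value, case_type):
    """Apply one of the documented case types to a single string."""
    if case_type == "LOWER":
        return value.lower()
    if case_type == "UPPER":
        return value.upper()
    if case_type == "CAPITAL":
        # Assumption: uppercase the first letter of every space-separated word
        # and lowercase the rest of each word.
        return " ".join(word.capitalize() for word in value.split(" "))
    if case_type == "SENTENCE":
        # Assumption: uppercase the first character and lowercase the rest.
        return value.capitalize()
    raise ValueError(f"Unsupported case_type: {case_type!r}")


for case_type in ("CAPITAL", "LOWER", "UPPER", "SENTENCE"):
    print(case_type, "->", format_case("new YORK city", case_type))
```

Raising on an unknown `case_type` is a design choice of this sketch. How the actual transform reports an unsupported value is not stated here.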

apply(cls, *args, **kwargs)

Inherited from GlueTransform apply.

name(cls)

Inherited from GlueTransform name.

describeArgs(cls)

Inherited from GlueTransform describeArgs.

describeReturn(cls)

Inherited from GlueTransform describeReturn.

describeTransform(cls)

Inherited from GlueTransform describeTransform.

describeErrors(cls)

Inherited from GlueTransform describeErrors.

describe(cls)

Inherited from GlueTransform describe.