使用 AWS SDK 的 AWS Glue API 代码示例 - AWS Glue

使用 AWS SDK 的 AWS Glue API 代码示例

以下代码示例显示如何将 AWS Glue 与 AWS 软件开发工具包(SDK)一起使用。

基础知识是向您展示如何在服务中执行基本操作的代码示例。

操作是大型程序的代码摘录,必须在上下文中运行。您可以通过操作了解如何调用单个服务函数,还可以通过函数相关场景的上下文查看操作。

有关 AWS SDK 开发人员指南和代码示例的完整列表,请参阅 将此服务与 AWS SDK 结合使用。本主题还包括有关入门的信息以及有关先前的 SDK 版本的详细信息。

开始使用

以下代码示例显示如何开始使用 AWS Glue。

.NET
AWS SDK for .NET
注意

在 GitHub 上查看更多内容。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

namespace GlueActions; public class HelloGlue { private static ILogger logger = null!; static async Task Main(string[] args) { // Set up dependency injection for AWS Glue. using var host = Host.CreateDefaultBuilder(args) .ConfigureLogging(logging => logging.AddFilter("System", LogLevel.Debug) .AddFilter<DebugLoggerProvider>("Microsoft", LogLevel.Information) .AddFilter<ConsoleLoggerProvider>("Microsoft", LogLevel.Trace)) .ConfigureServices((_, services) => services.AddAWSService<IAmazonGlue>() .AddTransient<GlueWrapper>() ) .Build(); logger = LoggerFactory.Create(builder => { builder.AddConsole(); }) .CreateLogger<HelloGlue>(); var glueClient = host.Services.GetRequiredService<IAmazonGlue>(); var request = new ListJobsRequest(); var jobNames = new List<string>(); do { var response = await glueClient.ListJobsAsync(request); jobNames.AddRange(response.JobNames); request.NextToken = response.NextToken; } while (request.NextToken is not null); Console.Clear(); Console.WriteLine("Hello, Glue. Let's list your existing Glue Jobs:"); if (jobNames.Count == 0) { Console.WriteLine("You don't have any AWS Glue jobs."); } else { jobNames.ForEach(Console.WriteLine); } } }
  • 有关 API 的详细信息,请参阅 AWS SDK for .NET API 参考中的 ListJobs

C++
SDK for C++
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

CMakeLists.txt CMake 文件的代码。

# Set the minimum required version of CMake for this project. cmake_minimum_required(VERSION 3.13) # Set the AWS service components used by this project. set(SERVICE_COMPONENTS glue) # Set this project's name. project("hello_glue") # Set the C++ standard to use to build this target. # At least C++ 11 is required for the AWS SDK for C++. set(CMAKE_CXX_STANDARD 11) # Use the MSVC variable to determine if this is a Windows build. set(WINDOWS_BUILD ${MSVC}) if (WINDOWS_BUILD) # Set the location where CMake can find the installed libraries for the AWS SDK. string(REPLACE ";" "/aws-cpp-sdk-all;" SYSTEM_MODULE_PATH "${CMAKE_SYSTEM_PREFIX_PATH}/aws-cpp-sdk-all") list(APPEND CMAKE_PREFIX_PATH ${SYSTEM_MODULE_PATH}) endif () # Find the AWS SDK for C++ package. find_package(AWSSDK REQUIRED COMPONENTS ${SERVICE_COMPONENTS}) if (WINDOWS_BUILD AND AWSSDK_INSTALL_AS_SHARED_LIBS) # Copy relevant AWS SDK for C++ libraries into the current binary directory for running and debugging. # set(BIN_SUB_DIR "/Debug") # if you are building from the command line you may need to uncomment this # and set the proper subdirectory to the executables' location. AWSSDK_CPY_DYN_LIBS(SERVICE_COMPONENTS "" ${CMAKE_CURRENT_BINARY_DIR}${BIN_SUB_DIR}) endif () add_executable(${PROJECT_NAME} hello_glue.cpp) target_link_libraries(${PROJECT_NAME} ${AWSSDK_LINK_LIBRARIES})

hello_glue.cpp 源文件的代码。

#include <aws/core/Aws.h> #include <aws/glue/GlueClient.h> #include <aws/glue/model/ListJobsRequest.h> #include <iostream> /* * A "Hello Glue" starter application which initializes an AWS Glue client and lists the * AWS Glue job definitions. * * main function * * Usage: 'hello_glue' * */ int main(int argc, char **argv) { Aws::SDKOptions options; // Optionally change the log level for debugging. // options.loggingOptions.logLevel = Utils::Logging::LogLevel::Debug; Aws::InitAPI(options); // Should only be called once. int result = 0; { Aws::Client::ClientConfiguration clientConfig; // Optional: Set to the AWS Region (overrides config file). // clientConfig.region = "us-east-1"; Aws::Glue::GlueClient glueClient(clientConfig); std::vector<Aws::String> jobs; Aws::String nextToken; // Used for pagination. do { Aws::Glue::Model::ListJobsRequest listJobsRequest; if (!nextToken.empty()) { listJobsRequest.SetNextToken(nextToken); } Aws::Glue::Model::ListJobsOutcome listRunsOutcome = glueClient.ListJobs( listJobsRequest); if (listRunsOutcome.IsSuccess()) { const std::vector<Aws::String> &jobNames = listRunsOutcome.GetResult().GetJobNames(); jobs.insert(jobs.end(), jobNames.begin(), jobNames.end()); nextToken = listRunsOutcome.GetResult().GetNextToken(); } else { std::cerr << "Error listing jobs. " << listRunsOutcome.GetError().GetMessage() << std::endl; result = 1; break; } } while (!nextToken.empty()); std::cout << "Your account has " << jobs.size() << " jobs." << std::endl; for (size_t i = 0; i < jobs.size(); ++i) { std::cout << " " << i + 1 << ". " << jobs[i] << std::endl; } } Aws::ShutdownAPI(options); // Should only be called once. return result; }
  • 有关 API 的详细信息,请参阅 AWS SDK for C++ API 参考中的 ListJobs

Java
SDK for Java 2.x
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

package com.example.glue; import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.glue.GlueClient; import software.amazon.awssdk.services.glue.model.ListJobsRequest; import software.amazon.awssdk.services.glue.model.ListJobsResponse; import java.util.List; public class HelloGlue { public static void main(String[] args) { GlueClient glueClient = GlueClient.builder() .region(Region.US_EAST_1) .build(); listJobs(glueClient); } public static void listJobs(GlueClient glueClient) { ListJobsRequest request = ListJobsRequest.builder() .maxResults(10) .build(); ListJobsResponse response = glueClient.listJobs(request); List<String> jobList = response.jobNames(); jobList.forEach(job -> { System.out.println("Job Name: " + job); }); } }
  • 有关 API 的详细信息,请参阅 AWS SDK for Java 2.x API 参考中的 ListJobs

JavaScript
SDK for JavaScript (v3)
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

import { ListJobsCommand, GlueClient } from "@aws-sdk/client-glue"; const client = new GlueClient({}); export const main = async () => { const command = new ListJobsCommand({}); const { JobNames } = await client.send(command); const formattedJobNames = JobNames.join("\n"); console.log("Job names: "); console.log(formattedJobNames); return JobNames; };
  • 有关 API 详细信息,请参阅《AWS SDK for JavaScript API 参考》中的 ListJobs

Python
SDK for Python (Boto3)
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

import boto3 from botocore.exceptions import ClientError def hello_glue(): """ Lists the job definitions in your AWS Glue account, using the AWS SDK for Python (Boto3). """ try: # Create the Glue client glue = boto3.client("glue") # List the jobs, limiting the results to 10 per page paginator = glue.get_paginator("get_jobs") response_iterator = paginator.paginate( PaginationConfig={"MaxItems": 10, "PageSize": 10} ) # Print the job names print("Here are the jobs in your account:") for page in response_iterator: for job in page["Jobs"]: print(f"\t{job['Name']}") except ClientError as e: print(f"Error: {e}") if __name__ == "__main__": hello_glue()
  • 有关 API 详细信息,请参阅《AWS SDK for Python (Boto3) API 参考》中的 ListJobs

Ruby
适用于 Ruby 的 SDK
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

require 'aws-sdk-glue' require 'logger' # GlueManager is a class responsible for managing AWS Glue operations # such as listing all Glue jobs in the current AWS account. class GlueManager def initialize(client) @client = client @logger = Logger.new($stdout) end # Lists and prints all Glue jobs in the current AWS account. def list_jobs @logger.info('Here are the Glue jobs in your account:') paginator = @client.get_jobs(max_results: 10) jobs = [] paginator.each_page do |page| jobs.concat(page.jobs) end if jobs.empty? @logger.info("You don't have any Glue jobs.") else jobs.each do |job| @logger.info("- #{job.name}") end end end end if $PROGRAM_NAME == __FILE__ glue_client = Aws::Glue::Client.new manager = GlueManager.new(glue_client) manager.list_jobs end
  • 有关 API 的详细信息,请参阅 AWS SDK for Ruby API 参考中的 ListJobs

Rust
适用于 Rust 的 SDK
注意

查看 GitHub,了解更多信息。查找完整示例,学习如何在 AWS 代码示例存储库中进行设置和运行。

let mut list_jobs = glue.list_jobs().into_paginator().send(); while let Some(list_jobs_output) = list_jobs.next().await { match list_jobs_output { Ok(list_jobs) => { let names = list_jobs.job_names(); info!(?names, "Found these jobs") } Err(err) => return Err(GlueMvpError::from_glue_sdk(err)), } }
  • 有关 API 详细信息,请参阅《AWS SDK for Rust API 参考》中的 ListJobs