

# Configure source settings
<a name="configure-source"></a>

You can configure the source settings based on the source that you choose for sending data to a Firehose stream from the console. You can configure source settings when Amazon MSK or Amazon Kinesis Data Streams is the source. There are no source settings for Direct PUT as the source.

# Configure source settings for Amazon MSK
<a name="writing-with-msk"></a>

When you choose Amazon MSK to send information to a Firehose stream, you can choose between MSK provisioned and MSK Serverless clusters. Firehose then reads data from the specified Amazon MSK cluster and topic and loads it into the specified S3 destination.

In the **Source settings** section of the page, provide values for the following fields.

****Amazon MSK cluster connectivity****  
Choose either the **Private bootstrap brokers** (recommended) or **Public bootstrap brokers** option based on your cluster configuration. Bootstrap brokers are the endpoints that an Apache Kafka client uses as a starting point to connect to the cluster. Public bootstrap brokers are intended for public access from outside of AWS, while private bootstrap brokers are intended for access from within AWS. For more information about Amazon MSK, see [Amazon Managed Streaming for Apache Kafka](https://docs.aws.amazon.com/msk/latest/developerguide/what-is-msk.html).   
To connect to a provisioned or serverless Amazon MSK cluster through private bootstrap brokers, the cluster must meet all of the following requirements.  
+ The cluster must be active.
+ The cluster must have IAM as one of its access control methods.
+ Multi-VPC private connectivity must be enabled for the IAM access control method.
+ You must attach to the cluster a resource-based policy that grants the Firehose service principal permission to invoke the Amazon MSK `CreateVpcConnection` API operation.
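
The last requirement above refers to a resource-based cluster policy along these lines. This is a minimal sketch; substitute your own cluster ARN, and check the Amazon MSK documentation for the exact policy your setup needs.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "firehose.amazonaws.com" },
      "Action": ["kafka:CreateVpcConnection"],
      "Resource": "arn:aws:kafka:us-east-1:111122223333:cluster/demo-cluster/11111111-1111-1111-1111-111111111111-1"
    }
  ]
}
```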
To connect to a provisioned Amazon MSK cluster through public bootstrap brokers, the cluster must meet all of the following requirements.  
+ The cluster must be active.
+ The cluster must have IAM as one of its access control methods.
+ The cluster must be publicly accessible.

****MSK cluster account****  
You can choose the account where the Amazon MSK cluster resides. This can be one of the following.  
+ **Current account** – Allows you to ingest data from an MSK cluster in the current AWS account. For this, you must specify the ARN of the Amazon MSK cluster from where your Firehose stream will read data.
+ **Cross-account** – Allows you to ingest data from an MSK cluster in another AWS account. For more information, see [Cross-account delivery from Amazon MSK](controlling-access.md#cross-account-delivery-msk).

****Topic****  
Specify the Apache Kafka topic from which you want your Firehose stream to ingest data. You cannot update this topic after Firehose stream creation completes.  
Firehose automatically decompresses Apache Kafka messages.
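
The console fields above map onto the Firehose API. The following is a minimal sketch using the boto3 `create_delivery_stream` operation with an `MSKSourceConfiguration`; all ARNs, role names, the topic, and the stream name are hypothetical placeholders.

```python
# Source configuration mirroring the console's Source settings section.
# The cluster ARN, role ARN, and topic name are hypothetical placeholders.
msk_source = {
    "MSKClusterARN": "arn:aws:kafka:us-east-1:111122223333:cluster/demo-cluster/11111111-1111-1111-1111-111111111111-1",
    "TopicName": "orders",  # cannot be changed after the stream is created
    "AuthenticationConfiguration": {
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-msk-role",
        "Connectivity": "PRIVATE",  # or "PUBLIC", matching your bootstrap broker choice
    },
}

def create_msk_stream():
    import boto3  # requires AWS credentials and network access when actually called
    client = boto3.client("firehose")
    return client.create_delivery_stream(
        DeliveryStreamName="msk-to-s3",
        DeliveryStreamType="MSKAsSource",
        MSKSourceConfiguration=msk_source,
        S3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::111122223333:role/firehose-s3-role",
            "BucketARN": "arn:aws:s3:::amzn-s3-demo-bucket",
        },
    )
```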

# Configure source settings for Amazon Kinesis Data Streams
<a name="writing-with-kinesis-streams"></a>

Configure the source settings for Amazon Kinesis Data Streams to send information to a Firehose stream as follows.

**Important**  
If you use the Kinesis Producer Library (KPL) to write data to a Kinesis data stream, you can use aggregation to combine the records that you write to that Kinesis data stream. If you then use that data stream as a source for your Firehose stream, Amazon Data Firehose de-aggregates the records before it delivers them to the destination. If you configure your Firehose stream to transform the data, Amazon Data Firehose de-aggregates the records before it delivers them to AWS Lambda. For more information, see [Developing Amazon Kinesis Data Streams Producers Using the Kinesis Producer Library](https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html) and [Aggregation](https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html#kinesis-kpl-concepts-aggretation).

Under **Source settings**, choose an existing stream in the **Kinesis data stream** list, or enter a data stream ARN in the format `arn:aws:kinesis:[Region]:[AccountId]:stream/[StreamName]`.

If you do not have an existing data stream, choose **Create** to create one from the Amazon Kinesis console. You might need an IAM role with the necessary permissions on the Kinesis stream. For more information, see [Grant Firehose access to an Amazon S3 destination](controlling-access.md#using-iam-s3). After you create a new stream, choose the refresh icon to update the **Kinesis data stream** list. If you have a large number of streams, filter the list using **Filter by name**. 
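
As a sketch of the same configuration done programmatically, the following builds a data stream ARN in the documented format and passes it to the boto3 `create_delivery_stream` operation as a `KinesisStreamSourceConfiguration`. The Region, account ID, stream name, and role ARNs are hypothetical placeholders.

```python
# Build the data stream ARN in the format the console accepts:
# arn:aws:kinesis:[Region]:[AccountId]:stream/[StreamName]
region, account_id, stream_name = "us-east-1", "111122223333", "demo-stream"
stream_arn = f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"

kinesis_source = {
    "KinesisStreamARN": stream_arn,
    "RoleARN": "arn:aws:iam::111122223333:role/firehose-kinesis-role",  # hypothetical
}

def create_kinesis_sourced_stream():
    import boto3  # requires AWS credentials when actually called
    client = boto3.client("firehose")
    return client.create_delivery_stream(
        DeliveryStreamName="kinesis-to-s3",
        DeliveryStreamType="KinesisStreamAsSource",
        KinesisStreamSourceConfiguration=kinesis_source,
        S3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::111122223333:role/firehose-s3-role",
            "BucketARN": "arn:aws:s3:::amzn-s3-demo-bucket",
        },
    )
```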

**Note**  
When you configure a Kinesis data stream as the source of a Firehose stream, the Amazon Data Firehose `PutRecord` and `PutRecordBatch` operations are disabled. To add data to your Firehose stream in this case, use the Kinesis Data Streams `PutRecord` and `PutRecords` operations.
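
In other words, producers write to the Kinesis data stream and Firehose picks the records up from there. A minimal sketch using the boto3 Kinesis `put_record` operation (the stream name and partition key are hypothetical placeholders):

```python
def write_record(kinesis_client, stream_name: str, payload: bytes, partition_key: str):
    # Write through the Kinesis Data Streams API, not the Firehose
    # PutRecord/PutRecordBatch operations, which are disabled in this setup.
    # kinesis_client is a boto3 "kinesis" client; the Firehose stream that
    # reads this data stream picks the record up automatically.
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=payload,
        PartitionKey=partition_key,
    )
```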

Amazon Data Firehose starts reading data from the `LATEST` position of your Kinesis stream. For more information about Kinesis Data Streams positions, see [GetShardIterator](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetShardIterator.html).

Amazon Data Firehose calls the Kinesis Data Streams [GetRecords](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html) operation once per second for each shard. However, when full backup is enabled, Firehose calls the Kinesis Data Streams `GetRecords` operation twice per second for each shard: once for the primary delivery destination and once for the full backup.
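
The polling behavior above is easy to put into numbers. This back-of-envelope helper (not part of any AWS API) estimates how many `GetRecords` calls per second a Firehose stream makes against a given stream:

```python
def getrecords_calls_per_second(shard_count: int, full_backup_enabled: bool) -> int:
    # Firehose polls each shard once per second, or twice per second
    # when full backup is enabled (primary destination + backup).
    return shard_count * (2 if full_backup_enabled else 1)

print(getrecords_calls_per_second(4, False))  # 4
print(getrecords_calls_per_second(4, True))   # 8
```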

More than one Firehose stream can read from the same Kinesis stream. Other Kinesis applications (consumers) can also read from the same stream. Each call from any Firehose stream or other consumer application counts against the overall throttling limit for the shard. To avoid getting throttled, plan your applications carefully. For more information about Kinesis Data Streams limits, see [Amazon Kinesis Streams Limits](https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html). 

Proceed to the next step to configure record transformation and format conversion.