

# Nova Forge SDK
<a name="nova-forge-sdk"></a>

The Nova Forge SDK is a Python SDK for customizing Amazon Nova models. It provides a unified interface for training, evaluation, monitoring, deployment, and inference across platforms including SageMaker AI and Amazon Bedrock, whether you're adapting a model to a domain-specific task or optimizing performance for your use case.

## Benefits
<a name="nova-forge-sdk-why-choose"></a>
+ One SDK for the entire model customization lifecycle, from data preparation to deployment and monitoring.
+ Support for multiple training methods including continued pre-training (CPT), supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement fine-tuning (RFT), both single-turn and multi-turn, with both LoRA and full-rank approaches.
+ Built-in support for SageMaker Training Jobs, SageMaker HyperPod, and Amazon Bedrock, with automatic resource management.
+ Automatic selection of the recipe and container URI for your training technique, so you don't have to look them up.
+ Bring your own training recipes or use the SDK's intelligent defaults with parameter overrides.
+ Validation of your configuration against supported model and instance combinations, catching errors before training starts.
+ Integrated Amazon CloudWatch monitoring to track training progress in real time.
+ Integrated MLflow support to track training experiments with SageMaker AI MLflow tracking servers.

## Requirements
<a name="nova-forge-sdk-requirements"></a>

**Supported Python Versions**

Nova Forge SDK is tested on:
+ Python 3.12

## Installation
<a name="nova-forge-sdk-installation"></a>

To install the SDK, run the following command:

```
pip install amzn-nova-forge
```
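
After installing, a quick check can confirm the interpreter version and that the package is visible to Python. The sketch below uses only the standard library. The distribution name `amzn-nova-forge` is taken from the command above; `check_environment` is a hypothetical helper, not part of the SDK.

```python
import sys
from importlib import metadata


def check_environment(dist_name="amzn-nova-forge", tested=(3, 12)):
    """Report the interpreter version and the installed version of dist_name.

    Hypothetical helper for illustration; it is not provided by the SDK.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None  # the package is not installed in this environment
    return {
        "python": sys.version_info[:2],
        "python_is_tested_version": sys.version_info[:2] == tested,
        "installed_version": installed,
    }


print(check_environment())
```

An `installed_version` of `None` means the package was not found in the active environment.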

## Supported Models and Techniques
<a name="nova-forge-sdk-supported-models"></a>

The SDK supports the following models and techniques within the Amazon Nova family. In the table, SMTJ refers to SageMaker Training Jobs and SMHP refers to SageMaker HyperPod.


| Method | Supported Models | 
| --- | --- | 
| Continued Pre-training | [All Nova Models](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-recipes.html#nova-model-recipes-reference) (SMHP only) | 
| Supervised Fine-tuning LoRA | [All Nova Models](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-recipes.html#nova-model-recipes-reference) | 
| Supervised Fine-tuning Full-Rank | [All Nova Models](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-recipes.html#nova-model-recipes-reference) (SMHP and SMTJ only) | 
| Direct Preference Optimization LoRA | Nova 1.0 models (SMHP and SMTJ only) | 
| Direct Preference Optimization Full-Rank | Nova 1.0 models (SMHP and SMTJ only) | 
| Reinforcement Fine-tuning LoRA | Nova Lite 2.0 | 
| Reinforcement Fine-tuning Full-Rank | Nova Lite 2.0 (SMHP and SMTJ only) | 
| Multi-turn Reinforcement Fine-tuning LoRA | Nova Lite 2.0 (SMHP only) | 
| Multi-turn Reinforcement Fine-tuning Full-Rank | Nova Lite 2.0 (SMHP only) | 
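
The table can also be read as a lookup from training method to the platforms it runs on. The sketch below restates it as plain Python data. It is not the SDK's validation logic, and it assumes that rows without a platform note are available on all three platforms (Bedrock, SMTJ, and SMHP). The names `SUPPORT_MATRIX`, `platforms_for`, and `is_supported` are illustrative.

```python
# Assumption: a row with no platform note in the table applies to all platforms.
ALL_PLATFORMS = ("Bedrock", "SMTJ", "SMHP")
TRAINING_ONLY = ("SMTJ", "SMHP")

# method (as written in the table) -> (supported models, platforms)
SUPPORT_MATRIX = {
    "Continued Pre-training": ("All Nova models", ("SMHP",)),
    "Supervised Fine-tuning LoRA": ("All Nova models", ALL_PLATFORMS),
    "Supervised Fine-tuning Full-Rank": ("All Nova models", TRAINING_ONLY),
    "Direct Preference Optimization LoRA": ("Nova 1.0 models", TRAINING_ONLY),
    "Direct Preference Optimization Full-Rank": ("Nova 1.0 models", TRAINING_ONLY),
    "Reinforcement Fine-tuning LoRA": ("Nova Lite 2.0", ALL_PLATFORMS),
    "Reinforcement Fine-tuning Full-Rank": ("Nova Lite 2.0", TRAINING_ONLY),
    "Multi-turn Reinforcement Fine-tuning LoRA": ("Nova Lite 2.0", ("SMHP",)),
    "Multi-turn Reinforcement Fine-tuning Full-Rank": ("Nova Lite 2.0", ("SMHP",)),
}


def platforms_for(method):
    """Return the platforms listed for a training method."""
    return list(SUPPORT_MATRIX[method][1])


def is_supported(method, platform):
    """Return True if the table lists the platform for the method."""
    return platform in SUPPORT_MATRIX[method][1]


print(platforms_for("Supervised Fine-tuning Full-Rank"))
print(is_supported("Direct Preference Optimization LoRA", "Bedrock"))
```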

## Getting Started
<a name="nova-forge-sdk-getting-started"></a>

**Topics**
+ [1. Prepare Your Data](#nova-forge-sdk-prepare-data)
+ [2. Configure Your Infrastructure](#nova-forge-sdk-configure-infrastructure)
+ [3. Train](#nova-forge-sdk-train)
+ [4. Monitor](#nova-forge-sdk-monitor)
+ [5. Evaluate](#nova-forge-sdk-evaluate)
+ [6. Deploy](#nova-forge-sdk-deploy)

### 1. Prepare Your Data
<a name="nova-forge-sdk-prepare-data"></a>

Load your dataset from local files or Amazon S3 and let the SDK transform it into the format your chosen training method requires. If your data is already in that format, you can provide it directly and get started immediately.

```
from amzn_nova_forge.dataset.dataset_loader import JSONLDatasetLoader
from amzn_nova_forge.model.model_enums import Model, TrainingMethod, TransformMethod

loader = JSONLDatasetLoader()
loader.load("s3://your-bucket/training-data.jsonl")
loader.transform(
    method=TransformMethod.SCHEMA,
    training_method=TrainingMethod.SFT_LORA,
    model=Model.NOVA_LITE_2,
    column_mappings={"question": "input", "answer": "output"},
)
```
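
Before handing a file to the loader, it can help to confirm that every line parses as JSON and carries the columns you plan to map. The following is a minimal standard-library sketch that is independent of the SDK. `check_jsonl` is a hypothetical helper, and the column names in the usage lines are placeholders for whatever your raw data uses.

```python
import json


def check_jsonl(lines, required_columns):
    """Return (line_number, problem) pairs for records that fail the check."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = sorted(set(required_columns) - set(record))
        if missing:
            problems.append((number, f"missing columns: {missing}"))
    return problems


# Placeholder records; in practice, iterate over the lines of your file.
sample = [
    '{"input": "What is 2 + 2?", "output": "4"}',
    '{"input": "What is the capital of France?"}',
    "not json",
]
for number, problem in check_jsonl(sample, ["input", "output"]):
    print(f"line {number}: {problem}")
```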

### 2. Configure Your Infrastructure
<a name="nova-forge-sdk-configure-infrastructure"></a>

Choose your compute resources. The SDK validates the configuration against supported model and instance combinations.

```
from amzn_nova_forge.manager.runtime_manager import BedrockRuntimeManager, SMTJRuntimeManager, SMHPRuntimeManager

# Bedrock
runtime = BedrockRuntimeManager(
    execution_role="arn:aws:iam::123456789012:role/ExampleRole"
)

# SageMaker Training Jobs
runtime = SMTJRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4
)

# SageMaker HyperPod
runtime = SMHPRuntimeManager(
    instance_type="ml.p5.48xlarge",
    instance_count=4,
    cluster_name="my-hyperpod-cluster",
    namespace="kubeflow"
)
```

### 3. Train
<a name="nova-forge-sdk-train"></a>

Start training with just a few lines of code. The `runtime` object passed as `infra` is the one created in the previous step.

```
from amzn_nova_forge.model import NovaModelCustomizer
from amzn_nova_forge.model.model_enums import Model, TrainingMethod

customizer = NovaModelCustomizer(
    model=Model.NOVA_LITE_2,
    method=TrainingMethod.SFT_LORA,
    infra=runtime,
    data_s3_path="s3://your-bucket/prepared-data.jsonl"
)

result = customizer.train(job_name="my-training-job")
```

### 4. Monitor
<a name="nova-forge-sdk-monitor"></a>

Track your training progress directly from the SDK.

```
from amzn_nova_forge.monitor.log_monitor import CloudWatchLogMonitor

# Monitor training logs
customizer.get_logs()

# Or monitor directly via CloudWatchLogMonitor
monitor = CloudWatchLogMonitor.from_job_result(result)
monitor.show_logs(limit=10)

# Check job status
result.get_job_status() # InProgress, Completed, Failed
```
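
For unattended runs, you may want to block until the job reaches a terminal state. The sketch below is a generic polling loop. It assumes the status is available as one of the strings shown in the comment above (`InProgress`, `Completed`, `Failed`), which may not match the actual return type of `get_job_status()`. The `wait_until_done` helper and its parameters are hypothetical.

```python
import time

# Assumption: these strings mark a finished job, per the comment in the example above.
TERMINAL_STATES = {"Completed", "Failed"}


def wait_until_done(get_status, poll_seconds=60.0, timeout_seconds=None, sleep=time.sleep):
    """Call get_status() repeatedly until it returns a terminal state.

    Raises TimeoutError if timeout_seconds elapses first.
    """
    start = time.monotonic()
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if timeout_seconds is not None and time.monotonic() - start >= timeout_seconds:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Stand-in status source for demonstration; no real job is queried.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final = wait_until_done(lambda: next(fake_statuses), poll_seconds=0)
print(final)
```

With a real job, pass a callable that returns the job's status as a string, adapting it if the SDK returns a different type.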

### 5. Evaluate
<a name="nova-forge-sdk-evaluate"></a>

Evaluate model performance with a variety of [built-in benchmarks](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-evaluation.html#nova-model-evaluation-benchmark), or design your own evaluations.

```
from amzn_nova_forge.recipe_config.eval_config import EvaluationTask

# Evaluate on benchmark tasks
eval_result = customizer.evaluate(
    job_name="model-eval",
    eval_task=EvaluationTask.MMLU,
    model_path=result.model_artifacts.checkpoint_s3_path
)
```

### 6. Deploy
<a name="nova-forge-sdk-deploy"></a>

Deploy your customized model to production with built-in support for Amazon Bedrock or SageMaker AI.

```
from amzn_nova_forge.model.model_enums import DeployPlatform

# Bedrock provisioned throughput
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_PT,
    pt_units=10
)

# Bedrock On-Demand
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.BEDROCK_OD,
    pt_units=10
)

# SageMaker Real-time Inference
deployment = customizer.deploy(
    model_artifact_path=result.model_artifacts.checkpoint_s3_path,
    deploy_platform=DeployPlatform.SAGEMAKER,
    unit_count=10,
    sagemaker_instance_type="ml.p5.48xlarge",
    sagemaker_environment_variables={
        "CONTEXT_LENGTH": "12000",
        "MAX_CONCURRENCY": "16",
    }
)
```

## Key Capabilities
<a name="nova-forge-sdk-key-capabilities"></a>

### On-the-Fly Recipe Creation
<a name="nova-forge-sdk-recipe-creation"></a>

The SDK selects the appropriate recipe and container URI for your chosen technique, so you don't need to search for them.

### Intelligent Data Processing
<a name="nova-forge-sdk-data-processing"></a>

The SDK automatically transforms your data into the correct format for training. Whether you're working with JSON, JSONL, or CSV files, the data loader handles the conversion. The data loader supports text as well as multimodal data (images and videos).

### Enterprise Infrastructure Support
<a name="nova-forge-sdk-infrastructure-support"></a>

The SDK works with both SageMaker Training Jobs and SageMaker HyperPod, automatically managing:
+ Instance type validation
+ Recipe validation
+ Dataset validation
+ Job orchestration and monitoring

The SDK also supports SageMaker Training Jobs serverless and Bedrock customization.

### Comprehensive Evaluation
<a name="nova-forge-sdk-evaluation"></a>

Evaluate your customized models against [standard benchmarks](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-hp-evaluate.html) including:
+ MMLU (Massive Multitask Language Understanding)
+ BBH (BIG-Bench Hard, advanced reasoning tasks)
+ GPQA (Graduate-Level Google-Proof Q&A)

Either use the benchmark defaults, or modify them to fit your needs:
+ BYOM (Bring Your Own Metric)
+ BYOD (Bring Your Own Dataset)
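
As an illustration of what a custom metric might compute, the sketch below implements a normalized exact-match score in plain Python. How a metric is registered with the SDK is not covered here, so the function name and signature are assumptions rather than the SDK's interface.

```python
def exact_match(predictions, references):
    """Return the fraction of predictions equal to their reference.

    Comparison ignores case and collapses runs of whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(text):
        return " ".join(text.lower().split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Two of the three predictions match after normalization.
score = exact_match(["Paris", " 4 ", "blue"], ["paris", "4", "red"])
print(score)
```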

### Production Deployment
<a name="nova-forge-sdk-deployment"></a>

Deploy your models to Amazon Bedrock or SageMaker AI with options for:
+ **Bedrock Provisioned Throughput** - Dedicated capacity for consistent performance
+ **Bedrock On-Demand (only applicable to LoRA-based customization)** - Pay-per-use pricing
+ **SageMaker AI Real-time Inference** - Dedicated capacity for consistent performance

### Batch Inference
<a name="nova-forge-sdk-batch-inference"></a>

Run large-scale inference jobs efficiently:
+ Process thousands of requests in parallel
+ Automatic result aggregation
+ Cost-effective batch processing
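
The general pattern behind parallel processing with result aggregation can be sketched with the standard library. This is not the SDK's batch inference API: `fake_invoke` is a stand-in for a real model call, and `run_batch` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_invoke(prompt):
    """Stand-in for a model call; returns the prompt in upper case."""
    return prompt.upper()


def run_batch(prompts, invoke, max_workers=8):
    """Run invoke over prompts in parallel and aggregate results in input order."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(invoke, prompt) for prompt in prompts]
        for prompt, future in zip(prompts, futures):
            try:
                results.append({"prompt": prompt, "output": future.result(), "error": None})
            except Exception as exc:  # record the failure and keep the rest of the batch
                results.append({"prompt": prompt, "output": None, "error": str(exc)})
    return results


for row in run_batch(["hello", "nova"], fake_invoke):
    print(row)
```

Collecting failures per request, rather than aborting on the first exception, keeps one bad input from discarding the rest of the batch.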

### Nova Forge
<a name="nova-forge-sdk-forge"></a>

For Nova Forge subscribers, the SDK supports data mixing recipes.

## Learn More
<a name="nova-forge-sdk-learn-more"></a>

Ready to start customizing Nova models with the Nova Forge SDK? Check out our GitHub repository for detailed guides, API references, and additional examples: [https://github.com/aws/nova-forge-sdk](https://github.com/aws/nova-forge-sdk)