Beispiel für eine geschwärzte Ausgabe (Batch)Beispiel für eine geschwärzte Streaming-Ausgaben Beispiel für eine Identifikationsausgabe PII

Beispiel für die Ausgabe von PII Redaktion und Identifizierung

Die folgenden Beispiele zeigen die geschwärzte Ausgabe von Batch- und Streaming-Jobs sowie die PII Identifizierung anhand eines Streaming-Jobs.

Transkriptionsaufträge, die die Inhaltsschwärzung verwenden, erzeugen zwei Arten von confidence-Werten. Die Zuverlässigkeit der automatischen Spracherkennung (ASR) gibt die Elemente an, die eine bestimmte type Äußerung haben pronunciation oder punctuation eine bestimmte Äußerung haben. In der folgenden Transkriptausgabe hat das Wort Good einen confidence von 1.0. Dieser Konfidenzwert gibt an, dass zu 100 Prozent sicher Amazon Transcribe ist, dass das in diesem Protokoll gesprochene Wort „Gut“ ist. Der confidence Wert für ein [PII] Schlagwort gibt die Gewissheit an, dass die Rede, die es zur Schwärzung markiert hat, auch wirklich stimmt. PII In der folgenden Transkriptausgabe 0.9999 gibt der Wert confidence von an, dass zu 99,99 Prozent sicher Amazon Transcribe ist, dass es sich bei der Entität, die im Protokoll redigiert wurde, handelt. PII

Beispiel für eine geschwärzte Ausgabe (Batch)


{
    "jobName": "my-first-transcription-job",
    "accountId": "111122223333",
    "isRedacted": true,
    "results": {
        "transcripts": [
            {
                "transcript": "Good morning, everybody. My name is [PII], and today I feel like
                sharing a whole lot of personal information with you. Let's start with my Social 
                Security number [PII]. My credit card number is [PII] and my C V V code is [PII].
                I hope that Amazon Transcribe is doing a good job at redacting that personal 
                information away. Let's check."
            }
        ],
        "items": [
            {
                "id": 0,
                "start_time": "2.86",
                "end_time": "3.35",
                "alternatives": [
                    {
                        "confidence": "1.0",
                        "content": "Good"
                    }
                ],
                "type": "pronunciation"
            },
            Items removed for brevity
            {
                "id": 8,
                "start_time": "5.56",
                "end_time": "6.25",
                "alternatives": [
                    {
                        "content": "[PII]",
                        "redactions": [
                            {
                                "confidence": "0.9999",
                                "type": "NAME",
                                "category": "PII"
                            }
                        ]
                    }
                ],
                "type": "pronunciation"
            },
            Items removed for brevity
        ],
    },
    "status": "COMPLETED"
}

Hier ist das ungekürzte Transkript zum Vergleich:


{
    "jobName": "job id",
    "accountId": "111122223333",
    "isRedacted": false,
    "results": {
        "transcripts": [
            {
                "transcript": "Good morning, everybody. My name is Mike, and today I feel like
                sharing a whole lot of personal information with you. Let's start with my Social 
                Security number 000000000. My credit card number is 5555555555555555 
                and my C V V code is 000. I hope that Amazon Transcribe is doing a good job 
                at redacting that personal information away. Let's check."
            }
        ],
        "items": [
            {
                "id": 0,
                "start_time": "2.86",
                "end_time": "3.35",
                "alternatives": [
                    {
                        "confidence": "1.0",
                        "content": "Good"
                    }
                ],
                "type": "pronunciation"
            },
            Items removed for brevity
            {
                "id": 8,
                "start_time": "5.56",
                "end_time": "6.25",
                "alternatives": [
                    {
                        "confidence": "0.9999",
                        "content": "Mike",
                     {                        
                ],
                "type": "pronunciation"
            },
            Items removed for brevity
        ],
    },
    "status": "COMPLETED"
}

Beispiel für eine geschwärzte Streaming-Ausgaben


{
    "TranscriptResultStream": {
        "TranscriptEvent": {
            "Transcript": {
                "Results": [
                    {
                        "Alternatives": [
                            {
                                "Transcript": "my name is [NAME]",
                                "Items": [
                                    {
                                        "Content": "my",
                                        "EndTime": 0.3799375,
                                        "StartTime": 0.0299375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "name",
                                        "EndTime": 0.5899375,
                                        "StartTime": 0.3899375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "is",
                                        "EndTime": 0.7899375,
                                        "StartTime": 0.5999375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "[NAME]",
                                        "EndTime": 1.0199375,
                                        "StartTime": 0.7999375,
                                        "Type": "pronunciation"
                                    }
                                ],
                                "Entities": [
                                    {
                                        "Content": "[NAME]",
                                        "Category": "PII",
                                        "Type": "NAME",
                                        "StartTime" : 0.7999375,
                                        "EndTime" : 1.0199375,
                                        "Confidence": 0.9989
                                    }
                                ]
                            }
                        ],
                        "EndTime": 1.02,
                        "IsPartial": false,
                        "ResultId": "12345a67-8bc9-0de1-2f34-a5b678c90d12",
                        "StartTime": 0.0199375
                    }
                ]
            }
        }
    }
}

Beispiel für eine Identifikationsausgabe PII

PIIDie Identifizierung ist eine zusätzliche Funktion, die Sie bei Ihrem Streaming-Transkriptionsjob verwenden können. Das Identifizierte PII ist im Entities Abschnitt jedes Segments aufgeführt.


{
    "TranscriptResultStream": {
        "TranscriptEvent": {
            "Transcript": {
                "Results": [
                    {
                        "Alternatives": [
                            {
                                "Transcript": "my name is mike",
                                "Items": [
                                    {
                                        "Content": "my",
                                        "EndTime": 0.3799375,
                                        "StartTime": 0.0299375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "name",
                                        "EndTime": 0.5899375,
                                        "StartTime": 0.3899375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "is",
                                        "EndTime": 0.7899375,
                                        "StartTime": 0.5999375,
                                        "Type": "pronunciation"
                                    },
                                    {
                                        "Content": "mike",
                                        "EndTime": 0.9199375,
                                        "StartTime": 0.7999375,
                                        "Type": "pronunciation"                                    
                                    }
                                ],
                                "Entities": [
                                    {
                                        "Content": "mike",
                                        "Category": "PII",
                                        "Type": "NAME",
                                        "StartTime" : 0.7999375,
                                        "EndTime" : 1.0199375,
                                        "Confidence": 0.9989
                                    }
                                ]
                            }
                        ],
                        "EndTime": 1.02,
                        "IsPartial": false,
                        "ResultId": "12345a67-8bc9-0de1-2f34-a5b678c90d12",
                        "StartTime": 0.0199375
                    }
                ]
            }
        }
    }
}

Warnung JavaScript ist in Ihrem Browser nicht verfügbar oder deaktiviert.

Zur Nutzung der AWS-Dokumentation muss JavaScript aktiviert sein. Weitere Informationen finden auf den Hilfe-Seiten Ihres Browsers.

Dokumentkonventionen

Schwärzen oder Identifizieren von PII in einem Echtzeit-Datenstrom

Erstellen von Untertiteln