Month: February 2022

Learning CoreAudio (4)

Posted by: Kobito
Posted on 2022-02-06

Let's start with UtilityFunctions. It defines NSPrint, a custom function for printing a formatted string. The blog's author apparently based it on an existing article.

void NSPrint(NSString *format, ...) { // [1]
  va_list args; // [2]

  va_start(args, format);
  NSString *string  = [[NSString alloc] initWithFormat:format arguments:args]; // [3]
  va_end(args);

  fprintf(stdout, "%s", [string UTF8String]); // [4]
    
#if !__has_feature(objc_arc)
  [string release]; // [5]
#endif
}

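[1] ... is C syntax declaring that the function accepts a variable number of arguments. Inside such a function you can use the va_list type and the va_start and va_end macros declared in stdarg.h.

[2] Declares args, a variable of type va_list. va_list stands for "variable argument list" and gives access to the arguments passed in place of the "...". It is only valid between va_start and va_end.

[3] Creates the instance string with NSString's initWithFormat:arguments: initializer, passing args as the arguments: parameter.

[4] fprintf is a C function, called as fprintf(stream, format, values...). Here it writes the UTF-8 representation of string to stdout.

[5] When ARC (Automatic Reference Counting) is not enabled, string is released manually.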
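To see the va_list mechanics from notes [1] and [2] without any Foundation types, here is a minimal sketch in plain C. The helper name format_message is my own invention for illustration. It forwards its variable arguments to the standard vsnprintf, much as NSPrint forwards them to initWithFormat:arguments:.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not from the original blog): formats into buf the way
   snprintf does, but is itself a variadic function. */
int format_message(char *buf, size_t size, const char *format, ...) {
    va_list args;                 /* state for walking the unnamed arguments */

    va_start(args, format);       /* format is the last named parameter */
    int needed = vsnprintf(buf, size, format, args);
    va_end(args);                 /* args must not be used after this point */

    return needed;                /* length the complete output requires */
}
```

Calling format_message(buf, sizeof buf, "%d-%s", 440, "sine") should leave "440-sine" in buf. The important part is the pairing: every va_start needs a matching va_end in the same function.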

Next is CheckError, a custom function that checks whether an error occurred.

void CheckError(OSStatus error, const char *operation) {
  if (error == noErr) {
    return;
  }
  
  char errorString[20]; // [6]
  *(UInt32 *)(errorString + 1) = CFSwapInt32HostToBig(error); // [7] we have 4 bytes and we put them in Big-endian ordering. 1st byte the biggest
  if (isprint(errorString[1]) && isprint(errorString[2]) && // [8]
      isprint(errorString[3]) && isprint(errorString[4])) {
    errorString[0] = errorString[5] = '\'';
    errorString[6] = '\0';
  } else {
    sprintf(errorString, "%d", (int) error);
  }
  NSLog(@"Error: %s (%s)\n", operation, errorString);
  exit(1);
}

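[6] Declares a 20-byte char array. Its contents are not initialized at this point.

[7] errorString + 1 is pointer arithmetic: it points at the second element of the array, the same address as &errorString[1]. Casting that address to UInt32 * and assigning through it writes the 4 bytes returned by CFSwapInt32HostToBig(error) into errorString[1] through errorString[4]. Big-endian is the byte order in which the most significant byte of the 4-byte UInt32 is stored first in memory.

[8] isprint() is a C function that checks whether a character is printable. For example, 'a' is printable and '\t' is not. If all four bytes are printable, the code wraps them in single quotes and shows them as a four-character code; otherwise it prints the error as a plain integer.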
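The same decision can be tried out in plain C, without Core Foundation. This is only a sketch under my own assumptions: format_status is a hypothetical name, and it extracts the bytes with shifts instead of CFSwapInt32HostToBig plus a pointer cast, which gives the same most-significant-byte-first order on any host.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the original blog): renders a 32-bit status
   code into out, either as a quoted four-character code or as a number. */
void format_status(int32_t status, char *out, size_t size) {
    uint32_t bits = (uint32_t)status;
    /* Most significant byte first, which is what big-endian order means. */
    unsigned char c[4] = {
        (unsigned char)(bits >> 24), (unsigned char)(bits >> 16),
        (unsigned char)(bits >> 8),  (unsigned char)bits
    };

    if (isprint(c[0]) && isprint(c[1]) && isprint(c[2]) && isprint(c[3])) {
        snprintf(out, size, "'%c%c%c%c'", c[0], c[1], c[2], c[3]);
    } else {
        snprintf(out, size, "%d", (int)status);
    }
}
```

With the value 0x666D743F (the bytes 'f', 'm', 't', '?') the output is the quoted code 'fmt?', while a value such as -50 has non-printable bytes and falls back to the number.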

Next, let's look at the custom function that looks up the sample rate of the default input device.

void GetDefaultInputDeviceSampleRate(Float64 *oSampleRate) {
  AudioObjectPropertyAddress propertyAddress; // [9]
  
  propertyAddress.mSelector = kAudioHardwarePropertyDefaultInputDevice;
  propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
  propertyAddress.mElement = 0; // master element
  
  AudioDeviceID deviceID = 0;
  UInt32 propertySize = sizeof(AudioDeviceID);
  
  CheckError(AudioObjectGetPropertyData(kAudioObjectSystemObject,
                                        &propertyAddress,
                                        0,
                                        NULL,
                                        &propertySize,
                                        &deviceID), "Getting default input device ID from Audio System Object"); // [9]'
  
  propertyAddress.mSelector = kAudioDevicePropertyNominalSampleRate; // [10]
  propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
  propertyAddress.mElement = 0;
  propertySize = sizeof(Float64);
  
  CheckError(AudioObjectGetPropertyData(deviceID,
                                        &propertyAddress,
                                        0,
                                        NULL,
                                        &propertySize,
                                        oSampleRate), "Getting nominal sample rate for the default device"); // [10]'
}

From [9] to [9]' the code obtains deviceID, and from [10] to [10]' it obtains the sample rate. Both steps use Core Audio's AudioObjectGetPropertyData() function.

// To be continued...

iOS Objective-C

Learning CoreAudio (2)

Posted by: Kobito
Posted on 2022-02-05

Part 2 works through Generating Raw Audio Samples. It generates a square wave, a saw wave, and a sine wave programmatically and writes each one to a file.

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

// CD-quality sample rate
#define SAMPLE_RATE 44100
#define BITS_PER_SAMPLE 16
#define BYTE_SIZE_IN_BITS 8
#define BYTES_PER_SAMPLE (BITS_PER_SAMPLE / BYTE_SIZE_IN_BITS) // parenthesized so the macro expands safely inside larger expressions
// We use LPCM so the encoding does not use packets. Hence,
// we are going to have 1 frame per packet.
#define FRAMES_PER_PACKET 1

// number of seconds we want to capture
#define DURATION 5.0
#define FILENAME_FORMAT @"%0.3f-%@.aif"

#define NUMBER_OF_CHANNELS 1

void buildFileURL(double hz, NSString *shape, NSURL** fileURL) {
    NSString* fileName = [NSString stringWithFormat:FILENAME_FORMAT, hz, shape];
    NSString* filePath = [[[NSFileManager defaultManager] currentDirectoryPath]
                          stringByAppendingPathComponent:fileName];
    *fileURL = [NSURL fileURLWithPath:filePath];
}

void buildAudioStreamBasicDescription(AudioStreamBasicDescription* audioStreamBasicDescription) {
    memset(audioStreamBasicDescription, 0, sizeof(AudioStreamBasicDescription)); // [5]
    
    audioStreamBasicDescription->mSampleRate = SAMPLE_RATE; // [6]
    audioStreamBasicDescription->mFormatID = kAudioFormatLinearPCM;
    audioStreamBasicDescription->mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    audioStreamBasicDescription->mBitsPerChannel = BITS_PER_SAMPLE;
    audioStreamBasicDescription->mChannelsPerFrame = NUMBER_OF_CHANNELS;
    audioStreamBasicDescription->mFramesPerPacket = FRAMES_PER_PACKET;
    audioStreamBasicDescription->mBytesPerFrame = BYTES_PER_SAMPLE * NUMBER_OF_CHANNELS;
    audioStreamBasicDescription->mBytesPerPacket = audioStreamBasicDescription->mFramesPerPacket * audioStreamBasicDescription->mBytesPerFrame;
} // buildAudioStreamBasicDescription

SInt16 generateSineShapeSample(int i, double waveLengthInSamples) {
    assert(i >= 1 && i <= waveLengthInSamples);
    
    return (SInt16)(SHRT_MAX * sin(2 * M_PI * (i - 1) / waveLengthInSamples));
}

SInt16 generateSquareShapeSample(int i, double waveLengthInSamples) {
    assert(i >= 1 && i <= waveLengthInSamples);
    
    if (i <= waveLengthInSamples / 2) {
        return SHRT_MAX;
    } else {
        return SHRT_MIN;
    }
}

SInt16 generateSawShapeSample(int i, double waveLengthInSamples) {
    assert(i >= 1 && i <= waveLengthInSamples);
    
    return (SInt16)(2 * SHRT_MAX / waveLengthInSamples * (i - 1) - SHRT_MAX);
}

NSString* correctShape(NSString *shape) { // [2]'
    if ([shape isEqualToString:@"square"] ||
        [shape isEqualToString:@"saw"] ||
        [shape isEqualToString:@"sine"])
    {
        return shape;
    } else {
        return @"square";
    }
}

int main(int argc, const char * argv[]) {
    if (argc < 2) {
        printf("Usage: WriteRawAudioSamples n <shape>\nwhere n is tone in Hz, shape is one of 'square' (default), 'saw', 'sine'\n");
        return -1;
    }
    
    @autoreleasepool {
        double hz = atof(argv[1]); // [1]
        assert(hz > 0);
        
        NSString *shape;
        if (argc == 2) {
            shape = @"square"; // [2]
        } else {
            shape = [NSString stringWithFormat:@"%s", argv[2]];
            shape = correctShape(shape);
        }
        
        NSLog(@"generating %f hz tone with shape %@...", hz, shape);
        
        NSURL *fileURL = NULL;
        buildFileURL(hz, shape, &fileURL);
        
        // Prepare the format
        AudioStreamBasicDescription audioStreamBasicDescription; // [3]
        buildAudioStreamBasicDescription(&audioStreamBasicDescription); // [4]
        
        // Set up the file
        AudioFileID audioFile; // [7]
        OSStatus error = noErr;
        
        error = AudioFileCreateWithURL((__bridge CFURLRef)fileURL,
                                       kAudioFileAIFFType,
                                       &audioStreamBasicDescription,
                                       kAudioFileFlags_EraseFile,
                                       &audioFile);
        assert(error == noErr);
        
        // Start writing samples;
        long maxSampleCount = SAMPLE_RATE * DURATION; // [8]
        
        long sampleCount = 1;
        UInt32 bytesToWrite = BYTES_PER_SAMPLE; // [9]
        double waveLengthInSamples = SAMPLE_RATE / hz; // [10]
        NSLog(@"wave (or cycle) length in samples: %.4f\n", waveLengthInSamples);
        
        while (sampleCount <= maxSampleCount) { // [11]
            for(int i = 1; i <= waveLengthInSamples; i++) {
                SInt16 sample = 0;
                
                if ([shape isEqualToString:@"square"]) {
                    sample = generateSquareShapeSample(i, waveLengthInSamples);
                }
                else if ([shape isEqualToString:@"saw"]) {
                    sample = generateSawShapeSample(i, waveLengthInSamples);
                } else if ([shape isEqualToString:@"sine"]) {
                    sample = generateSineShapeSample(i, waveLengthInSamples);
                }
                sample = CFSwapInt16HostToBig(sample);
                
                SInt64 offset = (sampleCount - 1) * bytesToWrite; // sampleCount starts at 1, so the first sample lands at byte 0
                error = AudioFileWriteBytes(audioFile, false, offset, &bytesToWrite, &sample);
                assert(error == noErr);
                
                sampleCount++;
            }
        }
        error = AudioFileClose(audioFile);
        assert(error == noErr);
        NSLog(@"wrote %ld samples", sampleCount - 1); // sampleCount is one past the last sample written
    }
    return 0;
}

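[1] atof(), the C function that converts a string to a double, is used to read the frequency.

[2] When argc is 2, meaning only the frequency was given, the default "square" (square wave) is selected. Otherwise "saw" (saw wave) or "sine" (sine wave) is selected according to the argument. If any string other than "square", "saw", or "sine" is passed, "square" is selected [2]'.

[3] To write audio data to a file, you need an AudioStreamBasicDescription instance that tells the CoreAudio API what format the data will be in.

[4] [5] The struct itself is already allocated by the declaration at [3]. memset() then writes 0 to every byte of it, sizeof(AudioStreamBasicDescription) bytes in total. This is apparently good practice, because it ensures that fields you never set do not hold unintended garbage values.

[6] Sets the parameters that are needed.

[7] Creates the audio file with AudioFileCreateWithURL().

[8] SAMPLE_RATE (how many samples are taken per second) x DURATION (seconds) gives the total number of samples. In this example, 44,100 x 5 = 220,500 samples.

[9] This simply expresses the bit depth (bits per sample) in bytes. With a bit depth of 16 and 1 byte == 8 bits, bytesToWrite is 2.

[10] Dividing the sample rate by the pitch (frequency in Hz) gives the number of samples in one cycle. For A4 (440 Hz), 44,100 / 440 = 100.2272...

[11] For each sample, the value is computed according to the wave type and written to the file at a byte offset derived from sampleCount (the position of the sample) and bytesToWrite (the size of each sample).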
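The arithmetic in notes [8] to [11] is easy to check in isolation. Below is a small sketch in plain C. The function names are my own, not from the blog, and the byte offset assumes a sample index that starts at 0 in mono 16-bit data. The sine generator is left out so that the sketch does not depend on the math library.

```c
#include <limits.h>
#include <stdint.h>

#define SAMPLE_RATE 44100
#define BITS_PER_SAMPLE 16
#define BYTES_PER_SAMPLE (BITS_PER_SAMPLE / 8)

/* [8] Total number of samples for a clip of the given length in seconds. */
long total_samples(double seconds) {
    return (long)(SAMPLE_RATE * seconds);
}

/* [10] Number of samples in one cycle of a tone at hz. */
double wavelength_in_samples(double hz) {
    return SAMPLE_RATE / hz;
}

/* [9][11] Byte offset of the n-th sample, with n starting at 0 (assumption). */
int64_t byte_offset(long n) {
    return (int64_t)n * BYTES_PER_SAMPLE;
}

/* Square wave: high for the first half of the cycle, low for the rest.
   i starts at 1, as in the listing above. */
int16_t square_sample(int i, double wavelength) {
    return (i <= wavelength / 2) ? SHRT_MAX : SHRT_MIN;
}

/* Saw wave: ramps linearly upward from -SHRT_MAX over one cycle. */
int16_t saw_sample(int i, double wavelength) {
    return (int16_t)(2.0 * SHRT_MAX / wavelength * (i - 1) - SHRT_MAX);
}
```

For a 5 second clip, total_samples returns 220,500, and a 440 Hz tone gives a wavelength of about 100.23 samples. Because the wavelength is usually not a whole number, a loop over whole samples covers slightly less than one true cycle, so the generated pitch is a close approximation rather than exact.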

iOS Objective-C

Learning CoreAudio (1)

Posted by: Kobito
Posted on 2022-02-05

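While looking for material to learn Apple's CoreAudio, I found a book called Learning Core Audio. The book is written in Objective-C, though, and it is old enough that parts of its code are now deprecated. Just as I was getting stuck, I came across a very good blog at exactly the right time. It works through Learning Core Audio and presents code that still compiles today. I am grateful that it also carefully explains the spots where beginners are likely to stumble! I plan to read through the blog, partly to study Objective-C as well. I appreciated the author's careful teaching style, so I will try to follow it and write posts that skip as little as possible.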

Let's start with the very first article, Reading Basic Info From a Local Audio File.

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

void GetAudioFileInformationProperty(AudioFileID audioFile, CFDictionaryRef *dictionary) {
    OSStatus theErr = noErr;
    UInt32 dictionarySize = 0;
    theErr = AudioFileGetPropertyInfo(audioFile,
                                      kAudioFilePropertyInfoDictionary,
                                      &dictionarySize,
                                      0);
    assert(theErr == noErr);
    
    theErr = AudioFileGetProperty(audioFile,
                                  kAudioFilePropertyInfoDictionary,
                                  &dictionarySize,
                                  dictionary);
    assert(theErr == noErr);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        if (argc < 2) { // [1]
            printf("Usage: CAMetadata fullpath/to/audiofile\n");
            return -1;
        }
        
        // [2] 
        NSString *audioFilePath = [[NSString stringWithUTF8String:argv[1]] stringByExpandingTildeInPath];
        NSURL *audioURL = [NSURL fileURLWithPath:audioFilePath];
        
        // [3]
        AudioFileID audioFile;
        OSStatus theErr = noErr;
        
        // [4]
        // [5] 
        theErr = AudioFileOpenURL((__bridge CFURLRef)audioURL, kAudioFileReadPermission, 0, &audioFile);
        
        // [6] 
        assert(theErr == noErr);
        
        CFDictionaryRef dictionary;
        
        GetAudioFileInformationProperty(audioFile, &dictionary);
        
        NSLog(@"dictionary: %@", dictionary);
        
        CFRelease(dictionary);
        
        theErr = AudioFileClose(audioFile);
        
        assert(theErr == noErr);
    }
    return 0;
}

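[1] argc is the number of arguments passed when the executable is run; it stands for "argument count". A C program is invoked as [executable name] [argument 1] [argument 2] ... (without the [ ], of course). The executable name itself is counted, so with two arguments argc is 3. If argc is less than 2, the program returns -1 and exits.

[2] First of all, the Objective-C method call syntax is startling, isn't it? In Objective-C, objects are always handled through pointers, as in NSString *audioFilePath. A method call is written inside [ ], as [ClassName classMethodName:argument] in the case of a class method.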

NSString *audioFilePath = [[NSString stringWithUTF8String:argv[1]] stringByExpandingTildeInPath];

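stringWithUTF8String: creates an NSString from a UTF-8 encoded C string, so non-Latin characters in the path are handled correctly. stringByExpandingTildeInPath expands a leading "~" in the path to the full path of the home directory.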

Rewritten in Swift, it would look roughly like this. (This is a loose translation.)

let audioFilePath = NSString(string: CommandLine.arguments[1]).expandingTildeInPath

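[3] AudioFileID audioFile; declares the variable. Why is it not declared as a pointer? Looking at the definition of AudioFileID, it is typedef struct OpaqueAudioFileID *AudioFileID;, so AudioFileID is already a pointer to the struct OpaqueAudioFileID. In Swift code I often see OpaquePointer and UnsafePointer and never quite understood the difference, but after reading an article on the topic it became a little clearer. C programs are split into header files and source files. When the header contains the full definition of the struct, Swift uses UnsafePointer; when the header only declares the struct and its contents are defined in a source file, Swift uses OpaquePointer.

[4] I did not know __bridge either, so I looked it up. Core Foundation objects are not managed by ARC, and __bridge casts between an Objective-C object pointer and a Core Foundation type without transferring ownership. Here the NSURL stays under ARC's control and is simply passed along as a CFURLRef.

[5] CoreAudio functions often take the address of a variable (&audioFile) as an argument and store the result there, as this call does.

[6] assert prints a diagnostic message and aborts the program when the condition inside ( ) is not true.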
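The argc check from note [1] and the out-parameter style from note [5] can be sketched in plain C. Everything here is hypothetical: the status codes, parse_positive, and has_required_argument are names I made up to mirror the shape of calls such as AudioFileOpenURL, where the return value reports success and the result arrives through a pointer.

```c
#include <stdlib.h>

/* Hypothetical status codes, standing in for OSStatus values such as noErr. */
enum { STATUS_OK = 0, STATUS_BAD_ARGUMENT = -1 };

/* Parses a positive number. The result goes through the out-parameter;
   the return value only reports success or failure. */
int parse_positive(const char *text, double *outValue) {
    if (text == NULL || outValue == NULL) return STATUS_BAD_ARGUMENT;

    char *end = NULL;
    double value = strtod(text, &end);   /* strtod itself uses an out-parameter */
    if (end == text || *end != '\0' || value <= 0) return STATUS_BAD_ARGUMENT;

    *outValue = value;                   /* written only on success */
    return STATUS_OK;
}

/* Mirrors the argc check in main: argv[0] is the program name, so one
   user-supplied argument means argc == 2. */
int has_required_argument(int argc) {
    return argc >= 2;
}
```

A caller declares double value = 0; and passes &value, then checks the returned status before trusting value. Writing the out-parameter only on success is a design choice of this sketch, not something the Core Audio functions are documented here to guarantee.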

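The executable that Xcode builds for a Command Line Tool project was in ~/Library/Developer/Xcode/DerivedData/<project name>-<hash>/Build/Products/Debug. From that directory it can be run with a command like the following.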

./CAMetadata ~/Desktop/sample.mp3
