How to query the Microsoft Cognitive Services Text Analytics API using cURL in PHP (example code)

This post shows a simple example of how to query the Microsoft Cognitive Services Text Analytics API by posting a JSON document through cURL in PHP and extracting the entities from the response.

The Text Analytics API is a web service that interprets unstructured text for tasks such as sentiment analysis, key phrase extraction and language detection. You post the text, the service runs its NLP models on it, and the result comes back as JSON.

I wrote this post because the example PHP code on Microsoft’s website uses the Apache HTTP client from HTTP Components to query the Text Analytics API. I couldn’t get that code working and kept hitting obstacles with that approach, so I reworked the query script to use PHP cURL.

It took me a while to figure this out, so I am sharing it in case someone comes across similar issues.

Here is a copy of the working code that sends a sample text (‘Microsoft released Windows 10’) to the API.

Make sure to read the comments in CAPS:

<?php
error_reporting(0);
$bingTextAnalyticsKey = "YOUR API KEY FOR TEXT ANALYTICS API";

$data =array (
    'documents' =>
        array (
            0 =>
                array (
                    'language' => 'en',
                    'id' => '1',
                    'text' => "Microsoft released Windows 10", // HERE GOES YOUR TEXT
                )
        ),
);

$data_string = json_encode($data);

// MAKE SURE TO USE URL FOR THE ENVIRONMENT YOU'VE CONFIGURED, OTHERWISE THIS WILL FAIL
$ch = curl_init('https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/entities'); 

curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        "Content-Type: application/json",
        "Ocp-Apim-Subscription-Key: $bingTextAnalyticsKey",
        'Content-Length: ' . strlen($data_string))
);

$result = curl_exec($ch);

// STOP HERE IF THE REQUEST ITSELF FAILED (DNS, TLS, TIMEOUT, ...)
if ($result === false) {
    echo 'Error: ' . curl_error($ch);
    curl_close($ch);
    exit;
}

curl_close($ch);
$return = json_decode($result, true);

print "<pre>";
print_r( $return);
print "</pre>";
?>
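
The script above sends a single document. If you want to send several texts in one request, the same 'documents' structure can be built from a list of strings. Here is a minimal sketch; the helper name buildDocumentsPayload is my own and not part of the API, and I am assuming that ids only need to be unique within one request.

```php
<?php
// HYPOTHETICAL HELPER (NOT PART OF THE ORIGINAL SCRIPT):
// turns a list of plain strings into the 'documents' structure used above.
function buildDocumentsPayload(array $texts, string $language = 'en'): array
{
    $documents = [];
    foreach (array_values($texts) as $index => $text) {
        $documents[] = [
            'language' => $language,
            // ASSUMPTION: ids only have to be unique within one request
            'id'       => (string) ($index + 1),
            'text'     => $text,
        ];
    }
    return ['documents' => $documents];
}

$payload = buildDocumentsPayload([
    'Microsoft released Windows 10',
    'Windows 10 is an operating system',
]);

// This string can be passed to CURLOPT_POSTFIELDS exactly like before
$data_string = json_encode($payload);
echo $data_string;
```

The rest of the cURL code stays the same; only the way $data_string is built changes.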

The example code above prints an array that looks like this:

Array
(
    [documents] => Array
        (
            [0] => Array
                (
                    [id] => 1
                    [entities] => Array
                        (
                            [0] => Array
                                (
                                    [name] => Windows 10
                                    [matches] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text] => Windows 10
                                                    [offset] => 19
                                                    [length] => 10
                                                )

                                        )

                                    [wikipediaLanguage] => en
                                    [wikipediaId] => Windows 10
                                    [wikipediaUrl] => https://en.wikipedia.org/wiki/Windows_10
                                    [bingId] => 5f9fbd03-49c4-39ef-cc95-de83ab897b94
                                )

                            [1] => Array
                                (
                                    [name] => Microsoft
                                    [matches] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [text] => Microsoft
                                                    [offset] => 0
                                                    [length] => 9
                                                )

                                        )

                                    [wikipediaLanguage] => en
                                    [wikipediaId] => Microsoft
                                    [wikipediaUrl] => https://en.wikipedia.org/wiki/Microsoft
                                    [bingId] => a093e9b9-90f5-a3d5-c4b8-5855e1b01f85
                                )

                        )

                )

        )

    [errors] => Array
        (
        )

)
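
Note the empty [errors] array at the end of the response. It is worth checking it before you use the entities. The following is only a sketch: collectErrors is a made-up helper name, and the 'id' and 'message' keys inside each error entry are my assumption about the error format, so verify them against a real failing response.

```php
<?php
// HYPOTHETICAL HELPER: collects error messages from a decoded response.
// ASSUMPTION: each entry in 'errors' carries 'id' and 'message' keys.
function collectErrors($response): array
{
    if (!is_array($response)) {
        // json_decode() returns null when the body is not valid JSON
        return ['Response could not be decoded as JSON'];
    }

    $messages = [];
    foreach ($response['errors'] ?? [] as $error) {
        $id      = $error['id'] ?? '?';
        $message = $error['message'] ?? 'unknown error';
        $messages[] = "Document $id: $message";
    }
    return $messages;
}

// Made-up response used only to show the helper in action
$sampleWithError = [
    'documents' => [],
    'errors'    => [['id' => '1', 'message' => 'Sample error text']],
];

foreach (collectErrors($sampleWithError) as $line) {
    echo $line . "<br>";
}
```

In the main script you would call collectErrors($return) right after json_decode and only continue when it returns an empty array.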

If you only want the names of the entities, comment out the <pre> section and add this to the bottom of the PHP code (before the closing ?>):

// ARRAY KEYS MUST BE QUOTED, BARE KEYS ARE AN ERROR IN PHP 8
for ($i = 0; $i < count($return['documents'][0]['entities']); $i++) {
    echo $return['documents'][0]['entities'][$i]['wikipediaId'] . "<br>";
}

It’ll generate:

Windows 10
Microsoft

These are essentially the names of the Wikipedia articles that the Text Analytics API matched in the sample text (‘Microsoft released Windows 10’).

If you want the Wikipedia URLs, just echo $return['documents'][0]['entities'][$i]['wikipediaUrl'] instead.
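
If you need the name, the URL and the match positions together, it can be easier to flatten the response into one row per entity. This is a rough sketch under the assumption that the response has the same shape as the output shown earlier; listEntities is a hypothetical helper name, and the sample array is a trimmed copy of that output.

```php
<?php
// HYPOTHETICAL HELPER: flattens the decoded response into one row per entity.
function listEntities(array $response): array
{
    $rows = [];
    foreach ($response['documents'] ?? [] as $document) {
        foreach ($document['entities'] ?? [] as $entity) {
            $rows[] = [
                'documentId' => $document['id'] ?? null,
                'name'       => $entity['name'] ?? null,
                'url'        => $entity['wikipediaUrl'] ?? null,
                // character offsets of every match of this entity in the text
                'offsets'    => array_column($entity['matches'] ?? [], 'offset'),
            ];
        }
    }
    return $rows;
}

// TRIMMED COPY OF THE RESPONSE SHOWN EARLIER
$sample = [
    'documents' => [[
        'id' => '1',
        'entities' => [
            [
                'name' => 'Windows 10',
                'matches' => [['text' => 'Windows 10', 'offset' => 19, 'length' => 10]],
                'wikipediaUrl' => 'https://en.wikipedia.org/wiki/Windows_10',
            ],
            [
                'name' => 'Microsoft',
                'matches' => [['text' => 'Microsoft', 'offset' => 0, 'length' => 9]],
                'wikipediaUrl' => 'https://en.wikipedia.org/wiki/Microsoft',
            ],
        ],
    ]],
    'errors' => [],
];

$rows = listEntities($sample);
foreach ($rows as $row) {
    echo $row['name'] . ' -> ' . $row['url'] . "<br>";
}
```

Because the helper loops over every document, it also works when you send more than one text in a single request.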

I hope this helped.