Merge pull request #93 from ruixinxu/master

Fix the issue for cognitive svc tutorial
Azure-Samples · May 25, 2021 · c718ff9
2 parents 803289a + e331979
Showing 1 changed file with 49 additions and 60 deletions.
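Most of the deletions in this diff are trailing empty strings in notebook cell `source` arrays, and the code cells also switch from `\r\n` to `\n` line endings. The following is a minimal sketch of that kind of cleanup, not the tool that produced this commit. The helper names `strip_trailing_empty` and `to_lf` are hypothetical and exist only for illustration.

```python
import json


def strip_trailing_empty(source):
    """Return a copy of a cell's source list without trailing empty strings.

    Hypothetical helper; mirrors the removed `""` entries seen in the diff.
    """
    cleaned = list(source)
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return cleaned


def to_lf(source):
    """Return a copy of a source list with CRLF line endings converted to LF.

    Hypothetical helper; mirrors the `\\r\\n` to `\\n` change in the code cells.
    """
    return [line.replace("\r\n", "\n") for line in source]


# A made-up markdown cell shaped like the ones touched by the diff.
cell = {"cell_type": "markdown", "source": ["## Next steps\r\n", "\r\n", ""]}
cell["source"] = strip_trailing_empty(cell["source"])
print(json.dumps(cell["source"]))
```

A cell whose `source` held only an empty string ends up with an empty list, which matches the `"source": []` change near the end of the diff.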
diff --git a/MachineLearning/Tutorial - Cognitive Service.ipynb b/MachineLearning/Tutorial - Cognitive Service.ipynb
@@ -37,8 +37,7 @@
         "- [LightGBM](https://github.com/Azure/mmlspark/blob/master/docs/lightgbm.md) – LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed and higher efficiency.\r\n",
         "- Conditional KNN - Scalable KNN Models with Conditional Queries.\r\n",
         "- [HTTP on Spark](https://github.com/Azure/mmlspark/blob/master/docs/http.md) – Enables distributed Microservices orchestration in integrating Spark and HTTP protocol-based accessibility.\r\n",
-        "\r\n",
-        ""
+        "\r\n"
       ]
     },
     {
@@ -61,8 +60,7 @@
         "- Anomaly Detector - detect anomalies within a time series data.\r\n",
         "- Speech to Text - convert streams or files of spoken audio to text.\r\n",
         "\r\n",
-        "If you don't have an Azure subscription, [create a free account before you begin](https://azure.microsoft.com/free/).\r\n",
-        ""
+        "If you don't have an Azure subscription, [create a free account before you begin](https://azure.microsoft.com/free/).\r\n"
       ]
     },
     {
@@ -128,8 +126,7 @@
       "source": [
         "## Text analytics sample\r\n",
         "\r\n",
-        "The [Text Analytics](https://docs.microsoft.com/azure/cognitive-services/text-analytics/) service provides several algorithms for extracting intelligent insights from text. For example, we can find the sentiment of given input text. The service will return a score between 0.0 and 1.0 where low scores indicate negative sentiment and high score indicates positive sentiment. This sample uses three simple sentences and returns the sentiment for each.\r\n",
-        ""
+        "The [Text Analytics](https://docs.microsoft.com/azure/cognitive-services/text-analytics/) service provides several algorithms for extracting intelligent insights from text. For example, we can find the sentiment of given input text. The service will return a score between 0.0 and 1.0 where low scores indicate negative sentiment and high score indicates positive sentiment. This sample uses three simple sentences and returns the sentiment for each.\r\n"
       ]
     },
     {
@@ -188,8 +185,7 @@
         "|--|--|\r\n",
         "| I am frustrated by this rush hour traffic! | negative |\r\n",
         "| this is a dog | neutral |\r\n",
-        "| I am so happy today, its sunny! | positive |\r\n",
-        ""
+        "| I am so happy today, its sunny! | positive |\r\n"
       ]
     },
     {
@@ -206,8 +202,7 @@
         "[Computer Vision](https://docs.microsoft.com/azure/cognitive-services/computer-vision/) analyzes images to identify structure such as faces, objects, and natural-language descriptions. In this sample, we tag the follow image. Tags are one-word descriptions of things in the image like recognizable objects, people, scenery, and actions.\r\n",
         "\r\n",
         "\r\n",
-        "![image](https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/objects.jpg)\r\n",
-        ""
+        "![image](https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/objects.jpg)\r\n"
       ]
     },
     {
@@ -259,8 +254,7 @@
         "\r\n",
         "|image | tags|\r\n",
         "|--|--|\r\n",
-        "| `https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/objects.jpg` | [skating, person, man, outdoor, riding, sport, skateboard, young, board, shirt, air, park, boy, side, jumping, ramp, trick, doing, flying] |\r\n",
-        ""
+        "| `https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/objects.jpg` | [skating, person, man, outdoor, riding, sport, skateboard, young, board, shirt, air, park, boy, side, jumping, ramp, trick, doing, flying] |\r\n"
       ]
     },
     {
@@ -274,8 +268,7 @@
       },
       "source": [
         "## Bing image search sample\r\n",
-        "[Bing Image Search](https://docs.microsoft.com/azure/cognitive-services/bing-image-search/overview) searches the web to retrieve images related to a user's natural language query. In this sample, we use a text query that looks for images with quotes. It returns a list of image URLs that contain photos related to our query.\r\n",
-        ""
+        "[Bing Image Search](https://docs.microsoft.com/azure/cognitive-services/bing-image-search/overview) searches the web to retrieve images related to a user's natural language query. In this sample, we use a text query that looks for images with quotes. It returns a list of image URLs that contain photos related to our query.\r\n"
       ]
     },
     {
@@ -295,29 +288,30 @@
         "collapsed": true
       },
       "source": [
-        "from pyspark.ml import PipelineModel\r\n",
-        "\r\n",
-        "# Number of images Bing will return per query\r\n",
-        "imgsPerBatch = 2\r\n",
-        "# A list of offsets, used to page into the search results\r\n",
-        "offsets = [(i*imgsPerBatch,) for i in range(10)]\r\n",
-        "# Since web content is our data, we create a dataframe with options on that data: offsets\r\n",
-        "bingParameters = spark.createDataFrame(offsets, [\"offset\"])\r\n",
-        "\r\n",
-        "# Run the Bing Image Search service with our text query\r\n",
-        "bingSearch = (BingImageSearch()\r\n",
-        "    .setSubscriptionKey(bingsearch_service_key)\r\n",
-        "    .setOffsetCol(\"offset\")\r\n",
-        "    .setQuery(\"Martin Luther King Jr. quotes\")\r\n",
-        "    .setCount(imgsPerBatch)\r\n",
-        "    .setOutputCol(\"images\"))\r\n",
-        "\r\n",
-        "# Transformer that extracts and flattens the richly structured output of Bing Image Search into a simple URL column\r\n",
-        "getUrls = BingImageSearch.getUrlTransformer(\"images\", \"url\")\r\n",
-        "pipeline_bingsearch = PipelineModel(stages=[bingSearch, getUrls])\r\n",
-        "\r\n",
-        "# Show the results of your search: image URLs\r\n",
-        "res_bingsearch = pipeline_bingsearch.transform(bingParameters)\r\n",
+        "from pyspark.ml import PipelineModel\n",
+        "\n",
+        "# Number of images Bing will return per query\n",
+        "imgsPerBatch = 2\n",
+        "# A list of offsets, used to page into the search results\n",
+        "offsets = [(i*imgsPerBatch,) for i in range(10)]\n",
+        "# Since web content is our data, we create a dataframe with options on that data: offsets\n",
+        "bingParameters = spark.createDataFrame(offsets, [\"offset\"])\n",
+        "\n",
+        "# Run the Bing Image Search service with our text query\n",
+        "bingSearch = (BingImageSearch()\n",
+        "    .setSubscriptionKey(bingsearch_service_key)\n",
+        "    .setOffsetCol(\"offset\")\n",
+        "    .setQuery(\"Martin Luther King Jr. quotes\")\n",
+        "    .setCount(imgsPerBatch)\n",
+        "    .setUrl(\"https://api.bing.microsoft.com/v7.0/images/search\")\n",
+        "    .setOutputCol(\"images\"))\n",
+        "\n",
+        "# Transformer that extracts and flattens the richly structured output of Bing Image Search into a simple URL column\n",
+        "getUrls = BingImageSearch.getUrlTransformer(\"images\", \"url\")\n",
+        "pipeline_bingsearch = PipelineModel(stages=[bingSearch, getUrls])\n",
+        "\n",
+        "# Show the results of your search: image URLs\n",
+        "res_bingsearch = pipeline_bingsearch.transform(bingParameters)\n",
         "display(res_bingsearch.dropDuplicates())"
       ]
     },
@@ -348,8 +342,7 @@
         "| `http://parryz.com/wp-content/uploads/2017/06/Amazing-Martin-Luther-King-Jr-Quotes.jpg` |\r\n",
         "| `http://everydaypowerblog.com/wp-content/uploads/2014/01/Martin-Luther-King-Jr.-Quotes1.jpg` |\r\n",
         "| `https://lessonslearnedinlife.net/wp-content/uploads/2020/05/Martin-Luther-King-Jr.-Quotes-2020.jpg` |\r\n",
-        "| `https://quotesblog.net/wp-content/uploads/2015/10/Martin-Luther-King-Jr-Quotes-Wallpaper.jpg` |\r\n",
-        ""
+        "| `https://quotesblog.net/wp-content/uploads/2015/10/Martin-Luther-King-Jr-Quotes-Wallpaper.jpg` |\r\n"
       ]
     },
     {
@@ -482,20 +475,20 @@
         "collapsed": true
       },
       "source": [
-        "# Create a dataframe with our audio URLs, tied to the column called \"url\"\r\n",
-        "df = spark.createDataFrame([(\"https://mmlspark.blob.core.windows.net/datasets/Speech/audio2.wav\",)\r\n",
-        "                           ], [\"url\"])\r\n",
-        "\r\n",
-        "# Run the Speech-to-text service to translate the audio into text\r\n",
-        "speech_to_text = (SpeechToTextSDK()\r\n",
-        "    .setSubscriptionKey(service_key)\r\n",
-        "    .setLocation(\"northeurope\") # Set the location of your cognitive service\r\n",
-        "    .setOutputCol(\"text\")\r\n",
-        "    .setAudioDataCol(\"url\")\r\n",
-        "    .setLanguage(\"en-US\")\r\n",
-        "    .setProfanity(\"Masked\"))\r\n",
-        "\r\n",
-        "# Show the results of the translation\r\n",
+        "# Create a dataframe with our audio URLs, tied to the column called \"url\"\n",
+        "df = spark.createDataFrame([(\"https://mmlspark.blob.core.windows.net/datasets/Speech/audio2.wav\",)\n",
+        "                           ], [\"url\"])\n",
+        "\n",
+        "# Run the Speech-to-text service to translate the audio into text\n",
+        "speech_to_text = (SpeechToTextSDK()\n",
+        "    .setSubscriptionKey(cognitive_service_key)\n",
+        "    .setLocation(\"northeurope\") # Set the location of your cognitive service\n",
+        "    .setOutputCol(\"text\")\n",
+        "    .setAudioDataCol(\"url\")\n",
+        "    .setLanguage(\"en-US\")\n",
+        "    .setProfanity(\"Masked\"))\n",
+        "\n",
+        "# Show the results of the translation\n",
         "display(speech_to_text.transform(df).select(\"url\", \"text.DisplayText\"))"
       ]
     },
@@ -513,8 +506,7 @@
         "\r\n",
         "|url | DisplayText |\r\n",
         "|--|--|\r\n",
-        "| `https://mmlspark.blob.core.windows.net/datasets/Speech/audio2.wav` | Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result from the custom speech portal. You can playback uploaded audio and determine if the provided recognition result is correct. This tool allows you to quickly inspect quality of Microsoft's baseline speech to text model or a trained custom model without having to transcribe any audio data.|\r\n",
-        ""
+        "| `https://mmlspark.blob.core.windows.net/datasets/Speech/audio2.wav` | Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result from the custom speech portal. You can playback uploaded audio and determine if the provided recognition result is correct. This tool allows you to quickly inspect quality of Microsoft's baseline speech to text model or a trained custom model without having to transcribe any audio data.|\r\n"
       ]
     },
     {
@@ -526,9 +518,7 @@
           }
         }
       },
-      "source": [
-        ""
-      ]
+      "source": []
     },
     {
       "cell_type": "markdown",
@@ -559,8 +549,7 @@
         "## Next steps\r\n",
         "\r\n",
         "* [Check out Synapse sample notebooks](https://github.com/Azure-Samples/Synapse/tree/main/MachineLearning) \r\n",
-        "* [MMLSpark GitHub Repo](https://github.com/Azure/mmlspark)\r\n",
-        ""
+        "* [MMLSpark GitHub Repo](https://github.com/Azure/mmlspark)\r\n"
       ]
     }
   ]
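
The paging setup in the Bing image search cell above can be checked without Spark. This is a plain-Python sketch that assumes, as the cell's comments suggest, that each row of the `offset` column drives one query returning `imgsPerBatch` results. The snake_case names and the `num_batches` variable are introduced here for illustration only.

```python
# Values taken from the notebook cell in the diff.
imgs_per_batch = 2   # images requested per query
num_batches = 10     # the cell uses range(10)

# Same expression as the notebook: one single-element tuple per row,
# which is the shape spark.createDataFrame expects for a one-column frame.
offsets = [(i * imgs_per_batch,) for i in range(num_batches)]

# Each offset marks where a page of results starts.
for (offset,) in offsets[:3]:
    last = offset + imgs_per_batch - 1
    print(f"offset={offset} covers result positions {offset}..{last}")

# Upper bound on image URLs before dropDuplicates() is applied.
max_results = len(offsets) * imgs_per_batch
print(max_results)
```

Under that assumption the ten rows request at most 20 image URLs in total, and `dropDuplicates()` in the cell may reduce the number actually displayed.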