
LINQ ForEach() and async / await

 1 year ago
source link: https://blog.darkthread.net/blog/linq-foreach-n-async/


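I made a rookie async/await mistake and went around in circles for almost half an hour, so I am posting this as a memento.

.NET 4.5/C# 5.0 introduced the concept of asynchronous functions along with the async/await keywords, and asynchronous methods have gradually become the mainstream style in .NET. Take HttpClient, the class that replaces WebClient/HttpWebRequest: almost everything it offers is an ***Async() asynchronous method: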

Fig1_638075239272849970.png

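In the past, whenever I ran into an ***Async() method, my usual trick was to call .Result, .GetAwaiter().GetResult(), or .Wait() and fall back to the familiar synchronous style. As I write more and more .NET 6 code and async/await shows up everywhere in the official samples, I figured it was time to catch up and stop burying my head in the sand, so in my recent new projects I have been making a point of using async/await heavily.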
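To make the contrast concrete, here is a minimal sketch of the three blocking forms next to await. GetAnswerAsync is a hypothetical stand-in for any ***Async() API (it is not part of the original post), and Task.Delay simulates the I/O so the snippet runs offline.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical async method standing in for an ***Async() API (an assumption for illustration).
async Task<int> GetAnswerAsync()
{
    await Task.Delay(50); // simulate I/O latency
    return 42;
}

// Blocking style 1: .Result blocks the calling thread until the task completes.
int viaResult = GetAnswerAsync().Result;

// Blocking style 2: .GetAwaiter().GetResult() also blocks, but rethrows the original
// exception instead of wrapping it in an AggregateException.
int viaGetResult = GetAnswerAsync().GetAwaiter().GetResult();

// Blocking style 3: .Wait() blocks until completion; the value is then read from .Result.
var task = GetAnswerAsync();
task.Wait();
int viaWait = task.Result;

// Non-blocking style: await suspends this method and frees the thread while waiting.
int viaAwait = await GetAnswerAsync();

Console.WriteLine($"{viaResult} {viaGetResult} {viaWait} {viaAwait}");
```

All four calls yield the same value in a console program; the difference is whether the calling thread is held while waiting.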

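[Note] If you are not yet familiar with async/await, I recommend getting the basic concepts down before reading on.

Today I was writing some test code. Because it involved HttpClient GetAsync(), I changed the test method that uses it from void to async Task. The test method contains logic that fetches the content of the web pages listed in List<string> urls, and I instinctively wrote it as a LINQ-style ForEach (strictly speaking, List<T>.ForEach) with a statement lambda (o) => { ... }, which I then made asynchronous as async (o) => { ... await httpClient.GetAsync(...) ... }. In my head this would walk through the items one by one in order, waiting for each asynchronous call to finish before continuing. Instead, the ForEach loop behaved as if it never ran, the test failed, and I went around in circles for nearly half an hour before realizing I had made a rookie mistake.

The following program reproduces the problem (in a Console project I referenced the MSTest.TestFramework package to borrow Assert.AreEqual()):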

using Microsoft.VisualStudio.TestTools.UnitTesting;

var results = new Dictionary<string, string>();

await FillResults(results);
try
{
    Assert.AreEqual(2, results.Count);
    Console.WriteLine("Success");
}
catch (Exception ex)
{
    Console.WriteLine("ERROR:" + ex.Message);
}

async Task FillResults(Dictionary<string, string> results)
{
    var httpClient = new HttpClient();
    //var res = await httpClient.GetAsync(url_to_download_list);    
    //var urls = (await res.Content.ReadAsStringAsync())
    //    .Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
    // Use a fixed URL list to simulate the downloaded result
    var urls = new List<string> {
        "https://www.google.com",
        "https://www.microsoft.com"
    };
    urls.ForEach(async url =>
    {
        var response = await httpClient.GetAsync(url);
        results.Add(url, await response.Content.ReadAsStringAsync());
    });
}

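Running it prints ERROR:Assert.AreEqual failed. Expected:<2>. Actual:<0>., and when stepping through in the debugger, the body of ForEach looks as if it never ran.

After banging my head against the wall for a while, I thought it through and realized I had made two mistakes here:

  1. Changing the lambda to async url => { ... } makes it behave like an async method such as Task downloadUrl(string url) that nobody awaits. Strictly speaking, because ForEach accepts an Action<string>, the lambda compiles to async void, so there is not even a Task to await. ForEach invokes each item in turn but does not wait for it to finish, and FillResults returns before any download completes.
  2. Because ForEach kicks off the downloads for all urls one after another and they run at the same time, results.Add() can be called from multiple threads, which is a thread-safety problem since Dictionary is not safe for concurrent writes.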
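The first point can be observed without any network access. The sketch below is an illustration under assumptions, not the author's code: Task.Delay stands in for the HTTP call, and the names items and completed are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var items = new List<string> { "a", "b" };
var completed = new List<string>();

// List<T>.ForEach takes an Action<string>, so this async lambda is async void:
// ForEach gets control back at the first await and moves on to the next item.
items.ForEach(async item =>
{
    await Task.Delay(200);                    // simulated download (assumption)
    lock (completed) { completed.Add(item); } // continuations may run on pool threads
});

// ForEach has already returned, but the lambdas are still waiting on Task.Delay.
int countRightAfterForEach;
lock (completed) { countRightAfterForEach = completed.Count; }
Console.WriteLine($"Right after ForEach: {countRightAfterForEach}");

// Only after giving the fire-and-forget work time to finish do the items show up.
await Task.Delay(1000);
int countAfterWaiting;
lock (completed) { countAfterWaiting = completed.Count; }
Console.WriteLine($"After waiting: {countAfterWaiting}");
```

On a typical run the first count is 0 and the second is 2, which matches the Actual:<0> seen in the failing assertion.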

To get back to the original intent, running sequentially and waiting for each result, the simplest fix is to switch to foreach (var url in urls):

async Task FillResults(Dictionary<string, string> results)
{
    var httpClient = new HttpClient();
    // ...   
    var urls = new List<string> {
        "https://www.google.com",
        "https://www.microsoft.com"
    };
    foreach (var url in urls)
    {
        var response = await httpClient.GetAsync(url);
        results.Add(url, await response.Content.ReadAsStringAsync());
    }
}

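If you would rather download in parallel, there are many ways to do it. One example is Parallel.ForEachAsync, introduced in .NET 6, combined with a lock around the write to results to deal with the multithreading issue.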

async Task FillResults(Dictionary<string, string> results)
{
    var httpClient = new HttpClient();
    // ... omitted ...
    var urls = new List<string> {
        "https://www.google.com",
        "https://www.microsoft.com"
    };

    await Parallel.ForEachAsync(urls, new ParallelOptions {
        MaxDegreeOfParallelism = 2
    }, async (url, token) =>
    {
        var response = await httpClient.GetAsync(url);
        // await cannot be used inside a lock block
        // REF: https://blog.darkthread.net/blog/cs-in-depth-notes-6/
        var html = await response.Content.ReadAsStringAsync();
        lock (results) 
        {
            results.Add(url, html);
        }
    });
}
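As one of the "many ways" mentioned above, here is a hedged sketch of an alternative using Task.WhenAll. It is not the author's method: FakeDownloadAsync is a hypothetical stand-in for httpClient.GetAsync plus ReadAsStringAsync so the snippet runs offline, and unlike Parallel.ForEachAsync it does not cap the degree of parallelism.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the real HTTP download (an assumption for illustration).
async Task<string> FakeDownloadAsync(string url)
{
    await Task.Delay(100); // simulate network latency
    return $"<html>{url}</html>";
}

var urls = new List<string> {
    "https://www.google.com",
    "https://www.microsoft.com"
};

// Select with an async lambda produces one Task per URL; ToList starts them all now
// and avoids re-running the lambdas if the sequence is enumerated again.
var tasks = urls
    .Select(async url => (Url: url, Html: await FakeDownloadAsync(url)))
    .ToList();

// Task.WhenAll completes when every download has finished and returns the results in order.
var pairs = await Task.WhenAll(tasks);

// The dictionary is built on a single thread after all tasks finish, so no lock is needed.
var results = pairs.ToDictionary(p => p.Url, p => p.Html);
Console.WriteLine(results.Count);
```

The design choice here is to have each task return its result instead of mutating shared state, which sidesteps the thread-safety question altogether.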

That's it: one more bit of hands-on experience in the bag.

