Using PuppeteerSharp, the Puppeteer API for .NET (C#)

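Puppeteer Sharp is a .NET port of the official Node.js Puppeteer API. This article walks through how to use Puppeteer Sharp, with code samples and links to the documentation. It can take screenshots of web pages, save pages as PDF files, execute JavaScript (JS) code, and more.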

1. Documentation

PuppeteerSharp documentation: http://www.puppeteersharp.com/api/index.html

PuppeteerSharp source code: https://github.com/kblok/puppeteer-sharp

2. Installing PuppeteerSharp

Search for PuppeteerSharp in the NuGet package manager, select the PuppeteerSharp package, and click Install.

Related article: Using NuGet in VS (Visual Studio)

3. Using PuppeteerSharp

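1) Taking a screenshot of a page

// Download a compatible browser build if it is not already cached.
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true
});
var page = await browser.NewPageAsync();
await page.GoToAsync("http://www.google.com");
// outputFile is the path of the image file to write.
await page.ScreenshotAsync(outputFile);

You can also change the viewport before taking the screenshot:

await page.SetViewportAsync(new ViewPortOptions
{
    Width = 500,
    Height = 500
});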
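The two snippets above can be combined into one small program. The following is a minimal sketch, not the library's canonical sample: the class name FullPageScreenshot and the file name screenshot.png are made up for illustration, and the download call is copied from the article, so on recent PuppeteerSharp releases it may need to become the parameterless DownloadAsync(). It also shows the FullPage option, which captures the whole scrollable page rather than only the viewport.

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;

public static class FullPageScreenshot // hypothetical class name
{
    public static async Task Main()
    {
        // Same download call as in the article; newer releases may need DownloadAsync() instead.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        // Dispose the browser and the page when done so the browser process is closed.
        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        using (var page = await browser.NewPageAsync())
        {
            // Set the viewport first so the page is laid out at this size.
            await page.SetViewportAsync(new ViewPortOptions { Width = 500, Height = 500 });
            await page.GoToAsync("http://www.google.com");

            // FullPage = true captures the entire scrollable page, not just the 500x500 viewport.
            // "screenshot.png" is an assumed output path.
            await page.ScreenshotAsync("screenshot.png", new ScreenshotOptions { FullPage = true });
        }
    }
}
```

Setting the viewport before navigation avoids a second layout pass after the page has loaded.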

2) Saving a page as a PDF file

// Download a compatible browser build if it is not already cached.
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true
});
var page = await browser.NewPageAsync();
await page.GoToAsync("http://www.google.com");
// outputFile is the path of the PDF file to write.
await page.PdfAsync(outputFile);
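PdfAsync also accepts a PdfOptions argument for controlling the output. The sketch below assumes an A4 page with background graphics enabled; the class name PdfWithOptions and the file name page.pdf are illustrative, and the download call again follows the article's API version.

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;
using PuppeteerSharp.Media;

public static class PdfWithOptions // hypothetical class name
{
    public static async Task Main()
    {
        // Same download call as in the article; newer releases may need DownloadAsync() instead.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        using (var page = await browser.NewPageAsync())
        {
            await page.GoToAsync("http://www.google.com");

            var options = new PdfOptions
            {
                Format = PaperFormat.A4,   // paper size of the generated PDF
                PrintBackground = true     // include CSS background colors and images
            };

            // "page.pdf" is an assumed output path.
            await page.PdfAsync("page.pdf", options);
        }
    }
}
```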

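3) Injecting HTML into a page

// browser is the instance returned by Puppeteer.LaunchAsync above.
using (var page = await browser.NewPageAsync())
{
    // Replace the page content with the given HTML.
    await page.SetContentAsync("<div>My Receipt</div>");
    // Read back the full HTML of the page.
    var result = await page.GetContentAsync();
    await page.PdfAsync(outputFile);
    // SaveHtmlToDB is a placeholder for your own persistence code.
    SaveHtmlToDB(result);
}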

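4) Executing JavaScript (JS) code

// browser is the instance returned by Puppeteer.LaunchAsync above.
using (var page = await browser.NewPageAsync())
{
    // A plain expression is evaluated with EvaluateExpressionAsync.
    var seven = await page.EvaluateExpressionAsync<int>("4 + 3");
    // A function, optionally with arguments, is evaluated with EvaluateFunctionAsync.
    var someObject = await page.EvaluateFunctionAsync<dynamic>("(value) => ({a: value})", 5);
    Console.WriteLine(someObject.a);
}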
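The difference between the two evaluation methods is easy to mix up: EvaluateExpressionAsync takes a JavaScript expression, while EvaluateFunctionAsync takes a function definition and passes the remaining arguments to it. A minimal sketch of both, with strongly typed results (the class name EvaluateExample is made up for illustration, and the download call follows the article's API version):

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class EvaluateExample // hypothetical class name
{
    public static async Task Main()
    {
        // Same download call as in the article; newer releases may need DownloadAsync() instead.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        using (var page = await browser.NewPageAsync())
        {
            // An expression: the string is evaluated as-is in the page.
            var sum = await page.EvaluateExpressionAsync<int>("4 + 3");

            // A function: the extra arguments (6 and 7) are serialized and passed as a and b.
            var product = await page.EvaluateFunctionAsync<int>("(a, b) => a * b", 6, 7);

            Console.WriteLine($"sum = {sum}, product = {product}");
        }
    }
}
```

Passing values as function arguments, rather than concatenating them into the script string, avoids quoting and escaping mistakes.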

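5) Waiting for a selector to load

// browser is the instance returned by Puppeteer.LaunchAsync above.
using (var page = await browser.NewPageAsync())
{
    await page.GoToAsync("http://www.spapage.com");
    // Wait until an element matching the selector appears in the DOM.
    await page.WaitForSelectorAsync("div.main-content");
    await page.PdfAsync(outputFile);
}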
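If the element never appears, the wait eventually times out. The sketch below assumes a 5-second limit set through WaitForSelectorOptions and handles the resulting WaitTaskTimeoutException; the class name WaitWithTimeout, the timeout value, and the file name page.pdf are illustrative choices, and the URL is the placeholder used in the article.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class WaitWithTimeout // hypothetical class name
{
    public static async Task Main()
    {
        // Same download call as in the article; newer releases may need DownloadAsync() instead.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        using (var page = await browser.NewPageAsync())
        {
            await page.GoToAsync("http://www.spapage.com");
            try
            {
                // Give up after 5000 ms (an assumed value) instead of the default timeout.
                await page.WaitForSelectorAsync(
                    "div.main-content",
                    new WaitForSelectorOptions { Timeout = 5000 });

                // Only reached when the element appeared in time.
                await page.PdfAsync("page.pdf");
            }
            catch (WaitTaskTimeoutException)
            {
                Console.WriteLine("div.main-content did not appear within 5 seconds.");
            }
        }
    }
}
```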

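6) Waiting for a function condition to be met

// browser is the instance returned by Puppeteer.LaunchAsync above.
using (var page = await browser.NewPageAsync())
{
    await page.GoToAsync("http://www.spapage.com");
    // Start waiting without awaiting yet; the condition is still false at this point.
    var watchDog = page.WaitForFunctionAsync("window.innerWidth < 100");
    // Shrinking the viewport makes the condition true.
    await page.SetViewportAsync(new ViewPortOptions { Width = 50, Height = 50 });
    await watchDog;
}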

7) Connecting to a remote browser

var options = new ConnectOptions()
{
    // apikey is the access token issued by the remote browser service.
    BrowserWSEndpoint = $"wss://www.externalbrowser.io?token={apikey}"
};
var url = "https://www.google.com/";
using (var browser = await PuppeteerSharp.Puppeteer.ConnectAsync(options))
{
    using (var page = await browser.NewPageAsync())
    {
        await page.GoToAsync(url);
        await page.PdfAsync("wot.pdf");
    }
}


