A Whole News Story in One Sentence

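I just want to follow the news from the headlines. Full articles are too long to read properly, so I only want a single sentence that tells me what happened.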

PHP version 1

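<?php
// Fetch the news flash index page and take the link to the newest entry.
$data = call("https://www.liulinblog.com/kuaixun");
if ($data === false || !preg_match('/<h2 class="entry-title"><a target="_blank" href="(.*?)"/', $data, $matchdata)) {
    exit("Could not find the latest entry link.");
}

// Fetch the entry page and pull out the text between the first link paragraph and the closing div.
$data = call($matchdata[1]);
if ($data === false || !preg_match('/<\/a><\/p>(.*?)<\/div>/ism', $data, $json)) {
    exit("Could not extract the entry content.");
}
echo strip_tags($json[1]);

// GET a URL with cURL; returns the response body, or false on failure or an empty response.
function call($url)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_HEADER, 0);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36");
    curl_setopt($curl, CURLOPT_TIMEOUT, 5);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($curl);
    curl_close($curl);
    return $data ? $data : false;
}
?>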

PHP version 2

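<?php
error_reporting(0);
// The Zhihu profile page embeds its state as JSON in a <script id="js-initialData"> tag.
$html = file_get_contents("https://www.zhihu.com/people/mt36501");
if ($html === false || !preg_match('/<script id="js-initialData" type="text\/json">(.*?)<\/script>/ism', $html, $matchdata)) {
    exit("Could not find the initial data.");
}
$data = json_decode($matchdata[1], true);
// Take the last article in the list and keep only its <p> tags.
$articles = $data['initialState']['entities']['articles'] ?? [];
$json = end($articles);
if ($json === false) { exit("No articles found."); }
$json = strip_tags($json['content'], '<p>');
echo '<style type="text/css">p:hover{font-weight:bold;} body{background-image:url(bg1.png);} div{border-radius:30px;border:2px solid green;margin:2% 10%;} p{font-size:30px;text-align: left;padding:0% 4%;} </style><div>',$json,'</div>';
?>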

Java version

import cn.hutool.core.lang.Console;
import cn.hutool.core.util.ReUtil;
import cn.hutool.http.HttpRequest;

import java.util.List;
import java.util.regex.Pattern;

public class MicroLanguageEngine {

    /**
     * Request URL
     */
    private static final String BASE_URL = "https://www.liulinblog.com/kuaixun";

    /**
     * Regex for the link and title of today's brief
     */
    private static final Pattern TODAY_MICRO_LINK_RE = Pattern.compile("<a.*href=\"(https://www\\.liulinblog\\.com/\\d+\\.html)\">\\s+<img.*alt=\"(.*)\">\\s+</a>");

    /**
     * Regex for the content of today's brief
     */
    private static final Pattern TODAY_MICRO_CONTENT_RE = Pattern.compile("<h1.*>[\\s\\S]*<section>(\\d+[\\s\\S]*)</section>\\s+</section>");

    public static void main(String[] args) {
        String body = HttpRequest.get(BASE_URL)
                .execute()
                .body();

        // Use the regex to get the link and title of the first brief, i.e. today's brief
        List<String> microLinks = ReUtil.findAll(TODAY_MICRO_LINK_RE, body, 1);
        String sameDayMicroLink = microLinks.get(0);
        List<String> microNames = ReUtil.findAll(TODAY_MICRO_LINK_RE, body, 2);
        String sameDayMicroName = microNames.get(0);
        Console.log("sameDayMicro: [{}]({})", sameDayMicroName, sameDayMicroLink);


        // HTML document of today's brief
        String sameDayDocument = HttpRequest.get(sameDayMicroLink)
                .execute()
                .body();
        String sameDayContent = ReUtil.getGroup1(TODAY_MICRO_CONTENT_RE, sameDayDocument);
        Console.log("sameDayContent: {}", sameDayContent);


    }

}
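
All three versions depend on a regex matching the page markup, and that can stop matching whenever the site changes its layout. The sketch below shows how the link-extraction step behaves on matching and non-matching input, using only java.util.regex so it runs offline. The class name FirstLinkSketch, the method firstEntryLink, and the sample HTML are hypothetical and made up for illustration; the pattern follows the shape of the one in PHP version 1, and the real page markup may differ.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstLinkSketch {

    // Same shape as the pattern in PHP version 1; the lazy group stops at the first closing quote.
    private static final Pattern ENTRY_LINK = Pattern.compile(
            "<h2 class=\"entry-title\"><a target=\"_blank\" href=\"(.*?)\"");

    /** Returns the first entry link in the HTML, or empty if the markup does not match. */
    static Optional<String> firstEntryLink(String html) {
        if (html == null) {
            return Optional.empty();
        }
        Matcher m = ENTRY_LINK.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical markup, made up for this example.
        String sample = "<h2 class=\"entry-title\"><a target=\"_blank\" href=\"https://example.com/1.html\">A</a></h2>"
                + "<h2 class=\"entry-title\"><a target=\"_blank\" href=\"https://example.com/2.html\">B</a></h2>";
        System.out.println(firstEntryLink(sample).orElse("no match"));
        System.out.println(firstEntryLink("<html></html>").orElse("no match"));
    }
}
```

Returning an Optional makes the caller decide what to do when nothing matches, rather than failing on an empty result list.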

Source: https://hu60.cn/q.php/bbs.topic.100119.3.html

Last updated: 2021-05-21