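Write a scraper in Python: the user enters a city name and the number of records to collect, and the program uses that input to collect information on companies in that city, including company name, address, phone number, website, email, legal representative, and business scope. The program must be commented.
Compiled from ChatGPT-generated content found online. The text has no reference value; the program and code snippets are for illustration only.
Please refer to the code below: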
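# coding: utf-8
# Import the required modules (requests is a third-party package)
import re
import requests


def first_match(pattern, text):
    """Return group 1 of the first match of pattern in text, or '' if nothing matched."""
    match = re.search(pattern, text, re.S)
    return (match.group(1) or '') if match else ''


# Read the city name and the number of companies to collect from the user
city = input('Enter the city to search: ')
number = int(input('Enter the number of companies to collect: '))
# Build the search URL to scrape
url = 'http://www.qichacha.com/search?key={}'.format(city)
# Send the request and read the response body
response = requests.get(url, timeout=10).text
# Split the page into one chunk per company with a regular expression.
# NOTE: this pattern is empty in the source (its HTML was apparently stripped
# during extraction), so it must be filled in to match one result block.
companies = re.findall('', response, re.S)
# Initialize the list that stores one dictionary per company
result = []
# Iterate over the extracted chunks (the first match is skipped, as in the source)
for each in companies[1:number + 1]:
    # Extract the company name (the source reads the data-register_no attribute)
    name = first_match(r'data-register_no="(.*?)">', each)
    # Extract the company address (the source reads the data-register_capital attribute)
    address = first_match(r'data-register_capital="(.*?)"', each)
    # Extract the phone number (pattern is garbled in the source, kept as-is)
    tel = first_match(r'(.*?) | \s*?', each)
    # Extract the legal representative (pattern is garbled in the source, kept as-is)
    legal_person = first_match(r'="_blank">(.*?)\s*? | (.*?) | \s*?', each)
    # The source never extracts these three fields, so they are left empty here
    website = email = business_scope = ''
    # Collect all fields for this company into one dictionary
    data = {"name": name, "address": address, "tel": tel,
            "website": website, "email": email,
            "legal_person": legal_person, "business_scope": business_scope}
    # Append the dictionary to the result list
    result.append(data)
# Print the collected results
for each in result:
    print(each)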
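Because the page structure of the target site is not given here, the extraction step can only be illustrated on made-up input. The following is a minimal sketch, assuming a hypothetical HTML layout (`SAMPLE_HTML`, the `company` class, and the `data-name` / `data-address` attributes are all invented for illustration and are not the real site's markup). It keeps one pattern per field in a table and falls back to an empty string when a field is missing, rather than indexing `[0]` into a possibly empty list.

```python
import re

# Hypothetical markup: the real page structure is not given in the source.
SAMPLE_HTML = """
<div class="company" data-name="Example Trading Co." data-address="1 Sample Road">
  <span class="tel">010-12345678</span>
  <a class="site" href="http://example.com">site</a>
</div>
<div class="company" data-name="Demo Tech Ltd." data-address="2 Demo Street">
  <span class="tel">021-87654321</span>
</div>
"""

# Matches one company block (assumes blocks contain no nested <div>).
BLOCK_PATTERN = r'<div class="company".*?</div>'

# One pattern per field; each has exactly one capturing group.
FIELD_PATTERNS = {
    "name": r'data-name="(.*?)"',
    "address": r'data-address="(.*?)"',
    "tel": r'<span class="tel">(.*?)</span>',
    "website": r'<a class="site" href="(.*?)"',
}


def parse_companies(html, limit):
    """Return up to `limit` dicts, one per company block found in `html`."""
    results = []
    for block in re.findall(BLOCK_PATTERN, html, re.S)[:limit]:
        record = {}
        for field, pattern in FIELD_PATTERNS.items():
            match = re.search(pattern, block, re.S)
            # A missing field becomes an empty string instead of raising IndexError
            record[field] = match.group(1).strip() if match else ""
        results.append(record)
    return results


if __name__ == "__main__":
    for company in parse_companies(SAMPLE_HTML, limit=2):
        print(company)
```

Adapting this to a real page would mean replacing `BLOCK_PATTERN` and `FIELD_PATTERNS` with patterns written against the actual HTML. For anything beyond simple pages, an HTML parser is usually more robust than regular expressions.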
Public @ 2023-02-18 13:12:04