This is a translated article. Original: Writing a scanner to find reflected XSS vulnerabilities — Part 1[1]
Code: https://github.com/akhil-reni/xsstutorial — feel free to jump straight to the code; the article itself is a bit disorganized, and after reading it through I felt many parts were explained in a muddled way.
In 2016 I worked on a web application scanner project very similar to Burp Suite: it proxied HTTP requests from a browser or from Selenium automation tools and sent them to different modules/plugins for vulnerability scanning. The architecture we built was fairly modular; most of the application was written in Python, with the frontend in Django and Celery for asynchronous tasks.
My goal in writing this post is to help security engineers write vulnerability scanners, whether for themselves or for the community.
When building anything, we first need to stick to the basics and figure out the following:
•How does it work? Draw a simple flow chart of the scanner's workflow: what input it takes, how that data is analyzed, and what it finally outputs.
•What technology should it be built with? Choosing the right technology matters: for any candidate, you should know which libraries you will need and how to extend them as requirements grow. But most important of all, pick something you are familiar with. If GoLang code that takes me 20 hours produces output only slightly better than Python code I can write in 5 hours, I will pick Python without hesitation.
Let's get started. Since my previous project used Python, I'll stick with it here. First we create a functional diagram. To understand what a reflected XSS is and how to identify it, see: reflected cross-site scripting vulnerability[2].
The scanner can be broken down into the following modules:
•Raw HTTP request parser
•Initial prober
•Context analyzer
•Payload generator
•Payload verifier
We'll build each module first, and then put them all together at the end.
Create a Python virtualenv:
pip3 install virtualenv
python3 -m virtualenv xss_env
Activate the virtualenv:
cd xss_env/Scripts && activate
Create a new folder outside the virtualenv folder:
mkdir rxss
Now that the environment is set up, let's write some code. The first module is the raw HTTP request parser, which takes input from a file and converts it into a request object. For this we'll use the existing http library in Python 3:
from __future__ import absolute_import, unicode_literals
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message):
        self.error_code = code
        self.error_message = message
The class above takes a raw HTTP string and converts it into a request object.
POST /search.php?test=query HTTP/1.1
Host: testphp.vulnweb.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
Origin: http://testphp.vulnweb.com
Connection: close
Referer: http://testphp.vulnweb.com/search.php?test=query
Upgrade-Insecure-Requests: 1

searchFor=asdas&goButton=go
Save the request above as requests.txt (the code below reads that filename).
from __future__ import absolute_import, unicode_literals
from http.server import BaseHTTPRequestHandler
from io import BytesIO


class HTTPRequest(BaseHTTPRequestHandler):
    def __init__(self, request_text):
        self.rfile = BytesIO(request_text)
        self.raw_requestline = self.rfile.readline()
        self.error_code = self.error_message = None
        self.parse_request()

    def send_error(self, code, message):
        self.error_code = code
        self.error_message = message


with open("requests.txt", "rb") as f:
    request = HTTPRequest(f.read())
    if not request.error_code:
        print(request.command)          # prints method
        print(request.path)             # prints request path
        print(request.headers.keys())   # prints request headers
        print(request.headers['host'])  # prints request host
        content_len = int(request.headers.get('Content-Length'))
        print(request.rfile.read(content_len))  # prints request body
Save the code above as request_parser.py and run it with:
python3 request_parser.py
POST
/search.php?test=query
['Host', 'User-Agent', 'Accept', 'Accept-Language', 'Accept-Encoding', 'Content-Type', 'Content-Length', 'Origin', 'Connection', 'Referer', 'Upgrade-Insecure-Requests']
testphp.vulnweb.com
b'searchFor=asdas&goButton=go'
We have successfully parsed an HTTP request. The last remaining piece is to convert the body and query parameters into a dict, so that we can easily parse them and inject our own payloads.
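The conversion itself can be sketched with the standard library's urllib.parse (a minimal standalone example; the variable names here are mine, not the gist's):

```python
from urllib import parse

# Sample data taken from the request in requests.txt
raw_path = "/search.php?test=query"
raw_body = b"searchFor=asdas&goButton=go"

# Query string -> dict
params = dict(parse.parse_qsl(parse.urlsplit(raw_path).query))
# POST body -> dict
data = dict(parse.parse_qsl(raw_body.decode()))

print(params)  # {'test': 'query'}
print(data)    # {'searchFor': 'asdas', 'goButton': 'go'}
```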
The complete request-parsing code can be found here: https://gist.github.com/akhil-reni/5c20f40729179858570ad1ffdf4502f3
from __future__ import absolute_import, unicode_literals
from http.server import BaseHTTPRequestHandler
from io import BytesIO
from urllib import parse


class Request:
    def __init__(self):
        self.headers = None
        self.params = None
        self.data = None
        self.path = None

    def replace(self, string, payload):
        for k, v in self.headers.items():
            # note: str.replace returns a new string, so these two header
            # calls have no effect as written
            k.replace(string, payload)
            v.replace(string, payload)
        for k, v in self.params.items():
            self.params[k] = self.params[k].replace(string, payload)
        for k, v in self.data.items():
            self.data[k] = self.data[k].replace(string, payload)
        print(self.data)


class RequestParser(object):
    def __init__(self, request_text):
        self.request = Request()
        try:
            self.raw_request = HTTPRequest(request_text)
            if self.raw_request.error_code:
                raise Exception("failed parsing request")
            self.request.method = self.raw_request.command
            self.request.path = self.construct_path()
            self.request.headers = self.raw_request.headers
            self.request.data = self.convert(self.construct_data())
            self.request.params = self.convert(self.construct_params())
        except Exception as e:
            raise e

    def convert(self, data):
        if isinstance(data, bytes):
            return data.decode()
        if isinstance(data, (str, int)):
            return str(data)
        if isinstance(data, dict):
            return dict(map(self.convert, data.items()))
        if isinstance(data, tuple):
            return tuple(map(self.convert, data))
        if isinstance(data, list):
            return list(map(self.convert, data))
        if isinstance(data, set):
            return set(map(self.convert, data))

    def construct_path(self):
        return parse.urlsplit(self.raw_request.path).path

    def construct_data(self):
        return dict(parse.parse_qsl(self.raw_request.rfile.read(
            int(self.raw_request.headers.get('content-length')))))

    def construct_params(self):
        return dict(parse.parse_qsl(parse.urlsplit(self.raw_request.path).query))


with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
    print(parser.request.method)   # prints method
    print(parser.request.path)     # prints request path
    print(parser.request.headers)  # prints request headers
    print(parser.request.data)     # prints request body
    print(parser.request.params)   # prints request params
Once parsing is done, we need a way to insert probes into the request parameters and POST body, and to check whether the probe is reflected in the response. For this we'll use the requests package:
pip3 install requests
Create a new file create_insertions.py with the following code (also at https://gist.github.com/akhil-reni/ed890e7fb7d90a7581c3ce380744b609):
import copy


class GetInsertionPoints:
    def __init__(self, request):
        self.request = request
        self.requests = []
        self.params(append=True)
        self.body(append=True)

    def params(self, append: bool = False) -> None:
        if self.request.params:
            for q in self.request.params:
                request = copy.deepcopy(self.request)
                if append:
                    request.params[q] = str(request.params[q]) + " teyascan"
                else:
                    request.params[q] = "teyascan"
                request.insertion = q
                request.iplace = 'params'
                self.requests.append(request)

    def body(self, append: bool = False) -> None:
        if self.request.data:
            for q in self.request.data:
                request = copy.deepcopy(self.request)
                if append:
                    request.data[q] = str(request.data[q]) + " teyascan"
                else:
                    request.data[q] = "teyascan"
                request.insertion = q
                request.iplace = 'body'
                self.requests.append(request)


with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
    print(parser.request.method)   # prints method
    print(parser.request.path)     # prints request path
    print(parser.request.headers)  # prints request headers
    print(parser.request.data)     # prints request body
    print(parser.request.params)   # prints request params
    i_p = GetInsertionPoints(parser.request)
    print(i_p.requests)
The code above walks the parameters and body to build a list of request objects, each with the probe inserted as its payload.
python3 create_insertions.py
[<__main__.httprequest object at>, <__main__.httprequest object at>, <__main__.httprequest object at>]
Now we send each request and check which parameter's value is reflected in the response.
import requests


def send_request(request, scheme):
    url = "{}://{}{}".format(scheme, request.headers.get("host"), request.path)
    req = requests.Request(request.method, url, params=request.params,
                           data=request.data, headers=request.headers)
    r = req.prepare()
    s = requests.Session()
    response = s.send(r, allow_redirects=False, verify=False)
    return response


with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
    i_p = GetInsertionPoints(parser.request)
    for request in i_p.requests:
        response = send_request(request, "http")
        if "teyascan" in response.text:
            print("probe reflection found in " + request.insertion)
The output will look like this:
python .\test.py
probe reflection found in searchFor
A reflected probe can land in several different contexts, and each one requires different characters to build a working payload:
•HTML tag: the < > characters are required to construct a payload.
•HTML attribute name: a space and = are required; " and ' are optional.
•HTML attribute value: either a direct payload works, or " / ' are needed to break out and construct the payload.
•HTML text node: the < > characters are required to escape the text area and construct a payload.
•HTML comment: the < > ! characters are required to escape the comment and construct a payload.
•Style: the < > characters are required to escape the style block and construct a payload.
•Style attribute: the " character is required to escape the text area and construct a payload.
•Href attribute: either a direct payload works, or " is needed to escape the text area and construct the payload.
•JS node: a closing script tag is required to escape the script; quotes or other special characters can escape a JS variable or function.
Note: we are not building an advanced scanner here that covers every one of these contexts.
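To make the list above concrete, here is a small illustration (entirely mine, not from the article) of where a probe can land and which characters each context demands:

```python
# Hypothetical reflections of the probe in different response contexts.
# The context names are informal labels, not the analyzer's type strings.
reflections = {
    "html text node": "<div>PROBE</div>",              # needs < > to open a new tag
    "attribute value": '<input value="PROBE">',        # needs " (or ') to break out
    "html comment": "<!-- PROBE -->",                  # needs < > ! to close the comment
    "js string": "<script>var q = 'PROBE';</script>",  # needs ' or a closing script tag
}
for context, markup in reflections.items():
    print(context, "->", markup.replace("PROBE", "teyascan"))
```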
To write the context analyzer we will use a package called lxml, which parses HTML into an XML tree.
pip3 install lxml
We will write a class that takes the raw HTML response, converts it into an XML tree, searches for our string, and returns a list of contexts.
For example, run the following code:
from lxml import html

# The HTML string in the original post was partially eaten by the blog's
# renderer; the wrapper tag below is an assumption — any element containing
# the probe behaves the same.
string = "<div>teyascan</div>"
search_string = "teyascan"
page_html_tree = html.fromstring(string)
xpath = "//*[contains(text(),'" + search_string + "')]"
n = page_html_tree.xpath(xpath)
if len(n):
    print("INPUT IS REFLECTED BACK INSIDE HTML TAG CONTEXT")
You will see output similar to this:
(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
INPUT IS REFLECTED BACK INSIDE HTML TAG CONTEXT
In the code above, you can see that the raw HTML string is parsed and converted into an XML tree. We then search the XML tree with XPath expressions to find the context in which the string is reflected.
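A stripped-down version of such an analyzer might look like this (my own sketch of the same idea, not the gist's code; the gist handles more context types):

```python
from lxml import html

def get_contexts_sketch(body, probe):
    # Parse the response, then probe two example contexts with XPath:
    # text nodes and attribute values.
    tree = html.fromstring(body)
    contexts = []
    if tree.xpath("//*[contains(text(),'%s')]" % probe):
        contexts.append({"type": "text"})
    if tree.xpath("//@*[contains(.,'%s')]" % probe):
        contexts.append({"type": "attribute"})
    return {"payload": probe, "contexts": contexts}

result = get_contexts_sketch('<div data-x="teyascan">teyascan</div>', "teyascan")
print(result)
```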
The analyzer code is here: https://gist.github.com/akhil-reni/aa001c76748b1dddb3d50d141098905e#file-context_analyzer-py
Run the code:
(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
{'payload': 'teyascan', 'contexts': [{'type': 'text', 'count': 1}]}
We now have a working context analyzer. What remains is to create payloads based on the context and confirm them.
For this part we won't rely on a payload list or anything of the sort; instead we'll make the scanner itself smart enough.
First create a new file called payload_generator.py and, in it, a function payload_generator. The function takes a context type and returns a list of payloads, each paired with an XPath expression used to look it up in the XML tree.
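As a sketch of the idea (the payloads and structure below are inferred from the article's output; the gist's actual payload lists differ), such a function might look like:

```python
import random

def payload_generator_sketch(context_type):
    # A random nonce makes the confirmation XPath specific to this payload.
    nonce = random.randint(100000, 999999)
    if context_type in ("text", "htmltag"):
        return [{
            "payload": "<svg onload=alert(%d)>" % nonce,
            "find": "//svg[@onload[contains(.,%d)]]" % nonce,
        }]
    if context_type == "attribute":
        return [{
            "payload": '" onmouseover="alert(%d)' % nonce,
            "find": "//*[@onmouseover[contains(.,%d)]]" % nonce,
        }]
    return []  # contexts not handled in this sketch

print(payload_generator_sketch("htmltag"))
```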
The code is here: https://gist.github.com/akhil-reni/d60a88c64f3bd02690e0f19cb3752458#file-payload_generator-py
Then run the following code:
with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
    i_p = GetInsertionPoints(parser.request)
    for request in i_p.requests:
        response = send_request(request, "http")
        if "teyascan" in response.text:
            print("probe reflection found in " + request.insertion)
            contexts = ContextAnalyzer.get_contexts(response.text, "teyascan")
            final_payloads = []
            for context in contexts["contexts"]:
                print(context)
                payloads = payload_generator(context['type'])
                final_payloads.extend(payloads)
            print(final_payloads)
Output:
(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
{'type': 'htmltag', 'count': 1}
[{'payload': '', 'find': '//svg[@onload[contains(.,812132)]]'}]
We have successfully generated payloads based on context. The next step is to send requests carrying those payloads and confirm whether any of them succeed.
To do that we send the HTTP request again, but this time with the payload instead of the probe string. We first copy the request with deepcopy:
dup = copy.deepcopy(request)
Then replace "teyascan" in the request parameters, headers, and body using:
def replace(request, string, payload):
    for k, v in request.headers.items():
        # note: str.replace returns a new string, so these two header
        # calls have no effect as written
        k.replace(string, payload)
        v.replace(string, payload)
    for k, v in request.params.items():
        request.params[k] = request.params[k].replace(string, payload)
    for k, v in request.data.items():  # the original used self.data here, a bug
        request.data[k] = request.data[k].replace(string, payload)
Once that is done, we can send the request object with the send_request function we created earlier.
with open("requests.txt", "rb") as f:
    parser = RequestParser(f.read())
    i_p = GetInsertionPoints(parser.request)
    for request in i_p.requests:
        response = send_request(request, "http")
        if "teyascan" in response.text:
            print("probe reflection found in " + request.insertion)
            contexts = ContextAnalyzer.get_contexts(response.text, "teyascan")
            for context in contexts["contexts"]:
                print(context)
                payloads = payload_generator(context['type'])
                for payload in payloads:
                    dup = copy.deepcopy(request)
                    dup.replace("teyascan", payload['payload'])
                    response = send_request(dup, "http")
                    page_html_tree = html.fromstring(response.text)
                    count = page_html_tree.xpath(payload['find'])
                    if len(count):
                        print("request vulnerable")
                        print(dup.headers)
                        http = MakeRawHTTP(dup)
                        print(http.rawRequest)
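MakeRawHTTP itself is not shown in the article; a minimal stand-in that rebuilds a raw HTTP request string from the parsed request object could look like this (my sketch, following the Request fields used above, not the repo's implementation):

```python
from urllib import parse

def make_raw_http_sketch(request):
    # Re-encode params back into the path and the body dict back into a string.
    body = parse.urlencode(request.data or {})
    query = parse.urlencode(request.params or {})
    path = request.path + ("?" + query if query else "")
    lines = ["{} {} HTTP/1.1".format(request.method, path)]
    for k, v in request.headers.items():
        lines.append("{}: {}".format(k, v))
    return "\r\n".join(lines) + "\r\n\r\n" + body
```

Note that Content-Length is copied verbatim from the original headers here, so after a payload is substituted it may no longer match the body; a fuller implementation would recompute it.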
Output:
(xss_env) C:\Users\hungrysoul\Downloads\teya\teya>python test.py
probe reflection found in searchFor
VULNERABLE TO XSS
POST /search.php HTTP/1.1
Host: testphp.vulnweb.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 27
Origin: http://testphp.vulnweb.com
Connection: close
Referer: http://testphp.vulnweb.com/search.php?test=query
Upgrade-Insecure-Requests: 1

searchFor=asdas &goButton=go&
Repeat the request in a browser to confirm the vulnerability.
We have successfully found an XSS automatically.
The tutorial code can be found at https://github.com/akhil-reni/xsstutorial
Translator's closing note: the article feels a bit disorganized, and you don't have to read every word of it — looking straight at the code is enough to judge whether it is a useful reference.
[1] Writing a scanner to find reflected XSS vulnerabilities — Part 1: https://medium.com/@hungry.soul/writing-a-scanner-to-find-reflected-xss-vulnerabilities-part-1-5dd6de7d1a35
[2] reflected cross-site scripting vulnerability: https://portswigger.net/web-security/cross-site-scripting/reflected