Writing a Custom Burp Suite Extension (Bambda) for Automated API Parameter Discovery
Automating the discovery of API parameters is a critical task for any penetration tester aiming to thoroughly assess web applications and their underlying services. Manually identifying every potential parameter across a sprawling API surface is time-consuming and prone to human error. A custom Burp Suite extension written in Python (Jython), which this article loosely calls a "Bambda" (PortSwigger itself reserves that name for short Java snippets that run inside Burp), can intercept and parse HTTP traffic on the fly, systematically extracting and cataloging API parameters from both requests and responses. This makes reconnaissance more thorough and gives a more complete picture of the attack surface.
The Case for Automated API Parameter Discovery
Modern web applications increasingly rely on complex APIs, often exposing numerous endpoints with a multitude of parameters. These parameters, whether in the URL query string, request body, or HTTP headers, define the application's functionality and are prime targets for manipulation and vulnerability exploitation. Without a comprehensive list of parameters, security testing efforts remain incomplete. Manual enumeration falls short quickly as API complexity grows. An automated solution provides consistency and speed, allowing pentesters to focus on analyzing the impact of parameters rather than merely finding them. Automating this discovery process with a custom Bambda extension complements broader security testing initiatives, especially when integrating with platforms like Secably for comprehensive vulnerability scanning and automated web security assessments.
Burp Extensibility: Jython and Bambda
Burp Suite's extensibility framework allows security professionals to tailor its functionality to specific testing needs. Extensions can be written in Java, Ruby (via JRuby), or Python (via Jython, a Python 2.7 implementation that runs on the Java Virtual Machine). Jython offers a rapid development cycle, making it a popular choice for custom Burp tools. These Python-based extensions are what this article calls "Bambdas." The core of one that interacts with HTTP traffic involves implementing the `IBurpExtender` and `IHttpListener` interfaces from Burp's legacy Extender API.
Setting Up Your Burp Extension Environment
Before writing code, ensure your Burp Suite environment is configured for Jython:
1. **Download Jython Standalone JAR**: Obtain the latest Jython standalone JAR file from the official Jython website.
2. **Configure Burp**: In Burp Suite, navigate to `Extender > Options` (in recent releases the tab is labelled `Extensions` and the option sits in the extension settings). Under the "Python Environment" section, click "Select file..." and point it to your downloaded Jython standalone JAR.
3. **Load Your Extension**: Once your Python script is ready, go to `Extender > Extensions > Add`. Select "Python" as the extension type and choose your `.py` file. Any output from `print()` statements in your script will appear in the extension's "Output" tab.
Core Components of a Bambda Listener
A minimal Burp extension that listens to HTTP traffic must implement `IBurpExtender` and `IHttpListener`. The `registerExtenderCallbacks` method is where your extension initializes itself, sets its name, and registers the HTTP listener. The `processHttpMessage` method is the workhorse, called for every HTTP request and response passing through Burp.
from burp import IBurpExtender
from burp import IHttpListener
import json
import re
from urlparse import urlparse, parse_qsl

class BurpExtender(IBurpExtender, IHttpListener):

    def registerExtenderCallbacks(self, callbacks):
        self._callbacks = callbacks
        self._helpers = callbacks.getHelpers()
        self._callbacks.setExtensionName("API Parameter Discoverer")
        self._callbacks.registerHttpListener(self)
        self.discovered_params = set()
        print("[+] API Parameter Discoverer loaded.")

    def processHttpMessage(self, toolFlag, messageIsRequest, messageInfo):
        # We only care about requests for parameter discovery
        if messageIsRequest:
            self.analyze_request_for_params(messageInfo)
        # Optionally, you could analyze responses for parameters in links or embedded JSON
        # else:
        #     self.analyze_response_for_params(messageInfo)

    def analyze_request_for_params(self, messageInfo):
        # Implementation for analyzing request will go here
        pass

    # def analyze_response_for_params(self, messageInfo):
    #     # Implementation for analyzing response will go here
    #     pass
In the code above:
* `IBurpExtender` is the fundamental interface for all extensions.
* `IHttpListener` enables the extension to receive HTTP requests and responses.
* `_callbacks` provides access to Burp's API.
* `_helpers` offers utility methods for parsing and building HTTP messages.
* `setExtensionName` gives your extension a display name in the Extender tab.
* `registerHttpListener(self)` registers this class instance to receive HTTP messages.
* `discovered_params` is a Python `set` to store unique parameter names, preventing duplicates.
Extracting Parameters: A Deep Dive
The core logic resides within `analyze_request_for_params` (and potentially `analyze_response_for_params`). We'll focus on request analysis here, covering URL query parameters, JSON body parameters, and form-encoded body parameters.
URL Query Parameters
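Extracting query parameters involves parsing the URL. Burp's `IRequestInfo` object, obtained via `_helpers.analyzeRequest`, provides convenient methods such as `getUrl()`, and Python's `urlparse` module is effective for splitting the result. Note that `getUrl()` is only available when `analyzeRequest` is given the full `messageInfo` object (which carries the HTTP service details) rather than the raw request bytes alone.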
def extract_url_params(self, requestInfo):
    # getUrl() returns a java.net.URL; convert it to a string for urlparse
    url = requestInfo.getUrl()
    parsed_url = urlparse(str(url))
    query_params = parse_qsl(parsed_url.query)
    for name, value in query_params:
        self.add_param(name, "URL_QUERY")
The `urlparse` function breaks down a URL into components, and `parse_qsl` specifically parses the query string into key-value pairs.
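One caveat is worth illustrating: by default `parse_qsl` discards parameters whose value is empty, so a request such as `?debug=` would go unrecorded unless `keep_blank_values=True` is passed. The following standalone sketch (independent of Burp) shows the difference. The function name `query_param_names` and the example URL are illustrative assumptions, not part of the extension.

```python
try:
    from urlparse import urlparse, parse_qsl  # Python 2 / Jython
except ImportError:
    from urllib.parse import urlparse, parse_qsl  # Python 3


def query_param_names(url):
    """Return query parameter names in order of first appearance."""
    query = urlparse(url).query
    names = []
    # keep_blank_values=True keeps parameters such as "debug=" that
    # parse_qsl would otherwise silently discard.
    for name, _value in parse_qsl(query, keep_blank_values=True):
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    print(query_param_names("https://api.example.test/v1/users?id=7&debug=&id=8"))
```

For parameter discovery the name matters more than the value, so keeping blank values is usually the safer default.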
JSON Body Parameters
For API requests, JSON is a prevalent format. Extracting parameters from a JSON body requires parsing the body content and recursively identifying all keys.
def extract_json_params(self, request_body):
    try:
        # Decode byte array to string, then parse JSON
        json_data = self._helpers.bytesToString(request_body)
        parsed_json = json.loads(json_data)
        self._recursive_json_param_extract(parsed_json)
    except ValueError:
        # Not valid JSON, or other parsing error
        pass

def _recursive_json_param_extract(self, obj):
    if isinstance(obj, dict):
        for key, value in obj.items():
            self.add_param(key, "JSON_BODY")
            self._recursive_json_param_extract(value)
    elif isinstance(obj, list):
        for item in obj:
            self._recursive_json_param_extract(item)
Here, `json.loads` converts the JSON string to a Python dictionary or list, then a recursive function traverses the structure to find all keys.
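Recording bare key names loses the nesting, so `id` inside `user` and `id` inside `order` collapse into a single entry. One possible variation, sketched here under the assumption that dotted paths are an acceptable naming scheme, yields the full path to each key instead. The name `json_key_paths` and the `[]` marker for array items are choices made for this illustration only.

```python
import json


def json_key_paths(obj, prefix=""):
    """Yield a dotted path for every key found in a parsed JSON value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = prefix + "." + key if prefix else key
            yield path
            for sub in json_key_paths(value, path):
                yield sub
    elif isinstance(obj, list):
        for item in obj:
            # All array items share one "[]" marker so that a list of
            # similar objects does not produce one path per index.
            for sub in json_key_paths(item, prefix + "[]"):
                yield sub


if __name__ == "__main__":
    body = '{"user": {"name": "a", "roles": [{"id": 1}, {"id": 2}]}}'
    for path in sorted(set(json_key_paths(json.loads(body)))):
        print(path)
```

The generator avoids `yield from` so that it also runs under Jython 2.7.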
Form-Encoded Body Parameters
Traditional HTML form submissions or API requests using `application/x-www-form-urlencoded` require a different parsing approach. Burp's helper functions or `urlparse.parse_qsl` can handle this.
def extract_form_params(self, request_body):
    # For form-urlencoded, the body is essentially a query string
    form_params = parse_qsl(self._helpers.bytesToString(request_body))
    for name, value in form_params:
        self.add_param(name, "FORM_BODY")
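Many frameworks encode nested structures in form bodies using bracket notation, for example `user[name]=bob` or `tags[]=a`. As a rough sketch, assuming that convention, the raw field names can be mapped to their base names so that related fields group together. The helper name `form_param_names` is hypothetical.

```python
try:
    from urlparse import parse_qsl  # Python 2 / Jython
except ImportError:
    from urllib.parse import parse_qsl  # Python 3


def form_param_names(body):
    """Map each raw field name to its base name (text before the first '[')."""
    names = {}
    # keep_blank_values=True so that empty fields are still recorded.
    for raw, _value in parse_qsl(body, keep_blank_values=True):
        names[raw] = raw.split("[", 1)[0]
    return names


if __name__ == "__main__":
    body = "user%5Bname%5D=bob&user%5Bemail%5D=&tags%5B%5D=a&tags%5B%5D=b&token=x"
    for raw, base in sorted(form_param_names(body).items()):
        print("%s -> %s" % (raw, base))
```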
Headers
While HTTP headers aren't typically "parameters" in the same mutable sense as query or body parameters, sensitive information or API tokens are often passed in headers (e.g., `Authorization`, `X-API-Key`). Collecting header names can still be valuable.
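def extract_header_names(self, requestInfo):
    headers = requestInfo.getHeaders()
    # The first entry is the request line (e.g. "GET /path HTTP/1.1"), so skip it
    for header_line in list(headers)[1:]:
        # Headers are typically "Name: Value"
        if ':' in header_line:
            # split() returns a list; the header name is its first element
            name = header_line.split(':', 1)[0].strip()
            self.add_param(name, "HEADER")
The `requestInfo.getHeaders()` method returns a list of strings: the request line first, followed by one string per header.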
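Since the first entry returned by `getHeaders()` is the request line rather than a header, a parser that only looks for a colon can misread a URL such as `/items?time=10:30` as a header name. The standalone sketch below skips that first entry and splits each remaining line on its first colon only. The function name `header_names` and the sample lines are assumptions made for the example.

```python
def header_names(header_lines):
    """Return header names from a list whose first entry is the request line."""
    names = []
    for line in list(header_lines)[1:]:  # skip e.g. "GET /x HTTP/1.1"
        # partition() splits on the first colon only, so values that
        # themselves contain colons are left intact.
        name, sep, _value = line.partition(":")
        if sep and name.strip():
            names.append(name.strip())
    return names


if __name__ == "__main__":
    sample = [
        "GET /items?time=10:30 HTTP/1.1",
        "Host: api.example.test",
        "X-API-Key: abc:def",
        "Authorization: Bearer t",
    ]
    print(header_names(sample))
```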
Storing and Presenting Findings
Maintaining a unique list of discovered parameters is essential, and a Python `set` is a natural fit. Printing each parameter to Burp's extension output (`stdout`) the first time it is seen keeps the tester informed.
# ... (inside BurpExtender class) ...
def add_param(self, name, param_type):
    # Jython implements Python 2.7, so use % formatting rather than f-strings
    param_identifier = "%s (%s)" % (name, param_type)
    if param_identifier not in self.discovered_params:
        self.discovered_params.add(param_identifier)
        print("Discovered: %s" % param_identifier)

def analyze_request_for_params(self, messageInfo):
    # Passing messageInfo (not just the raw bytes) makes getUrl() available
    requestInfo = self._helpers.analyzeRequest(messageInfo)
    # Extract URL parameters
    self.extract_url_params(requestInfo)
    # Extract Header names
    self.extract_header_names(requestInfo)
    # Extract Body parameters based on content type
    body_bytes = messageInfo.getRequest()[requestInfo.getBodyOffset():]
    # Find the Content-Type header and keep the value after the first colon
    content_type = ""
    for header in requestInfo.getHeaders():
        if header.lower().startswith('content-type:'):
            content_type = header.split(':', 1)[1].strip().lower()
            break
    if body_bytes:
        if "json" in content_type:
            self.extract_json_params(body_bytes)
        elif "x-www-form-urlencoded" in content_type:
            self.extract_form_params(body_bytes)
        # Add other content types as needed (e.g., multipart/form-data, XML)
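Substring checks such as `"json" in content_type` are simple, but they ignore parameters like `; charset=utf-8` and can match unintended values. A slightly stricter alternative, sketched here as a standalone function, isolates the media type first and then classifies it. The name `classify_body` and the category labels are hypothetical. Whether to treat structured suffixes such as `+json` as JSON is a design choice, not something the Burp API dictates.

```python
def classify_body(content_type):
    """Return 'json', 'form', 'multipart', 'xml' or None for a Content-Type value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return "json"
    if media_type == "application/x-www-form-urlencoded":
        return "form"
    if media_type.startswith("multipart/"):
        return "multipart"
    if media_type in ("application/xml", "text/xml") or media_type.endswith("+xml"):
        return "xml"
    return None


if __name__ == "__main__":
    for value in ("application/json; charset=utf-8",
                  "application/vnd.api+json",
                  "multipart/form-data; boundary=abc",
                  "text/plain"):
        print("%s -> %s" % (value, classify_body(value)))
```

Returning a label rather than calling a parser directly keeps the dispatch logic testable outside Burp.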
The Bambda in Action: Full Code Example
This combined script demonstrates a functional API parameter discovery extension.
from burp import IBurpExtender
from burp import IHttpListener
import json
import re
from urlparse import urlparse, parse_qsl

class BurpExtender(IBurpExtender, IHttpListener):

    def registerExtenderCallbacks(self, callbacks):
        self._callbacks = callbacks
        self._helpers = callbacks.getHelpers()
        self._callbacks.setExtensionName("API Parameter Discoverer")
        self._callbacks.registerHttpListener(self)
        self.discovered_params = set()
        print("[+] API Parameter Discoverer loaded. Monitoring HTTP traffic for parameters.")

    def processHttpMessage(self, toolFlag, messageIsRequest, messageInfo):
        # Only process requests from the Proxy or Repeater tools for active discovery
        # This prevents flooding the output with Scanner or Spider traffic unless desired
        if messageIsRequest and toolFlag in (self._callbacks.TOOL_PROXY, self._callbacks.TOOL_REPEATER):
            self.analyze_request_for_params(messageInfo)

    def analyze_request_for_params(self, messageInfo):
        # Pass messageInfo itself (not messageInfo.getRequest()) so that
        # requestInfo.getUrl() is available later
        requestInfo = self._helpers.analyzeRequest(messageInfo)
        # 1. Extract URL Query Parameters
        self.extract_url_params(requestInfo)
        # 2. Extract Header Names (useful for API keys, custom headers)
        self.extract_header_names(requestInfo)
        # 3. Extract Body Parameters based on Content-Type
        request_body_bytes = messageInfo.getRequest()[requestInfo.getBodyOffset():]
        if request_body_bytes:
            content_type = ""
            for header in requestInfo.getHeaders():
                if header.lower().startswith('content-type:'):
                    # Keep the value after the first colon
                    content_type = header.split(':', 1)[1].strip().lower()
                    break
            if "json" in content_type:
                self.extract_json_params(request_body_bytes)
            elif "x-www-form-urlencoded" in content_type:
                self.extract_form_params(request_body_bytes)
            # Add more handlers for other content types (e.g., XML, multipart/form-data) here
            # For XML, you'd need an XML parser (e.g., from xml.etree.ElementTree)
            # For multipart, it's more complex, often involving regex or specific libraries

    def extract_url_params(self, requestInfo):
        # getUrl() returns a java.net.URL; convert it to a string for urlparse
        url = requestInfo.getUrl()
        parsed_url = urlparse(str(url))
        query_params = parse_qsl(parsed_url.query)
        for name, value in query_params:
            self.add_param(name, "URL_QUERY")
        # Basic path parameter detection (heuristic: segments that look like UUIDs or numeric IDs)
        path_segments = parsed_url.path.split('/')
        for segment in path_segments:
            if re.match(r'^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$', segment, re.IGNORECASE):  # UUID
                self.add_param("UUID_PATH_PARAM", "PATH")
            elif re.match(r'^\d+$', segment):  # Numeric ID
                self.add_param("NUMERIC_PATH_PARAM", "PATH")
        # More sophisticated path parameter detection could involve analyzing Swagger/OpenAPI specs

    def extract_json_params(self, request_body_bytes):
        try:
            json_data = self._helpers.bytesToString(request_body_bytes)
            parsed_json = json.loads(json_data)
            self._recursive_json_param_extract(parsed_json)
        except ValueError:
            pass  # Not valid JSON

    def _recursive_json_param_extract(self, obj):
        if isinstance(obj, dict):
            for key, value in obj.items():
                self.add_param(key, "JSON_BODY")
                self._recursive_json_param_extract(value)
        elif isinstance(obj, list):
            for item in obj:
                self._recursive_json_param_extract(item)

    def extract_form_params(self, request_body_bytes):
        form_params = parse_qsl(self._helpers.bytesToString(request_body_bytes))
        for name, value in form_params:
            self.add_param(name, "FORM_BODY")

    def extract_header_names(self, requestInfo):
        headers = requestInfo.getHeaders()
        # The first entry is the request line, not a header, so skip it
        for header_line in list(headers)[1:]:
            if ':' in header_line:
                name = header_line.split(':', 1)[0].strip()
                self.add_param(name, "HEADER")

    def add_param(self, name, param_type):
        # Jython implements Python 2.7, so use % formatting rather than f-strings
        param_identifier = "%s (%s)" % (name, param_type)
        if param_identifier not in self.discovered_params:
            self.discovered_params.add(param_identifier)
            self._callbacks.issueAlert("New Parameter Discovered: %s" % param_identifier)
            print("Discovered: %s" % param_identifier)
This script logs newly discovered parameters to Burp's Extender output tab and also issues an alert, providing immediate feedback.
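The path heuristic in the script records only that some UUID or numeric segment was seen, not where it occurred. One way to retain that context, shown here as a standalone sketch and not as part of the extension, is to rewrite each path into a template. The function name `path_template` and the `{id}` / `{uuid}` placeholders are illustrative choices.

```python
import re

_UUID = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
_NUMERIC = re.compile(r"^\d+$")


def path_template(path):
    """Replace UUID and numeric path segments with placeholders."""
    segments = []
    for segment in path.split("/"):
        if _UUID.match(segment):
            segments.append("{uuid}")
        elif _NUMERIC.match(segment):
            segments.append("{id}")
        else:
            segments.append(segment)
    return "/".join(segments)


if __name__ == "__main__":
    print(path_template("/users/42/orders/123e4567-e89b-12d3-a456-426614174000"))
```

Storing templates in the set instead of generic markers would also deduplicate requests that differ only in their identifiers.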
Refinements and Next Steps
This basic framework can be extended significantly. Consider:
* **Response Analysis**: Extracting parameters from JSON or XML responses, particularly when they contain links or dynamic data that might hint at new API endpoints or parameters.
* **Context Menus**: Add a context menu item to send a selected request to Repeater or Intruder with discovered parameters pre-filled.
* **Filtering**: Implement host-based or path-based filtering to focus on specific targets.
* **Passive Scanning**: Integrate with Burp's `IScannerCheck` to report identified parameters as passive scan issues.
* **Output Formats**: Save discovered parameters to a file in a structured format (e.g., CSV, JSON) for later use by other tools.
* **Advanced Parameter Types**: Heuristic detection for path parameters (e.g., UUIDs, numeric IDs) and more complex body types.
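As an illustration of the output-format idea, the following sketch groups identifiers of the form `name (TYPE)`, as produced by `add_param`, by their type and writes them to a JSON file. The function name `export_params` and the file layout are assumptions. Inside the extension it would need to be called explicitly, for example from a menu action or when the extension unloads.

```python
import json


def export_params(discovered, path):
    """Write 'name (TYPE)' identifiers to a JSON file, grouped by type."""
    grouped = {}
    for identifier in sorted(discovered):
        # Identifiers look like "user_id (URL_QUERY)"; split on the last " (".
        name, _sep, rest = identifier.rpartition(" (")
        grouped.setdefault(rest.rstrip(")"), []).append(name)
    with open(path, "w") as handle:
        json.dump(grouped, handle, indent=2, sort_keys=True)
    return grouped


if __name__ == "__main__":
    found = set(["id (URL_QUERY)", "debug (URL_QUERY)", "email (JSON_BODY)"])
    print(export_params(found, "discovered_params.json"))
```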
This Bambda works whether Burp is directly proxying traffic or operating in a more complex setup involving proxy chains, such as those facilitated by GProxy to route traffic through various intermediary points for advanced reconnaissance or obfuscation. Such setups are common in advanced penetration testing scenarios where traffic needs to be routed through multiple layers before reaching the target.