haralyzer package¶
Submodules¶
haralyzer.assets module¶
Provides all the main functional classes for analyzing HAR files
-
class
haralyzer.assets.
HarEntry
(entry: dict)[source]¶ Bases:
haralyzer.mixins.MimicDict
An object that represents one entry in a HAR Page
-
cache
¶ Returns: Cached objects Return type: str
-
cookies
¶ Returns: Request and Response Cookies Return type: list
-
pageref
¶ Returns: Page for the entry Return type: str
-
port
¶ Returns: Port connection was made to Return type: int
-
secure
¶ Returns: Connection was secure Return type: bool
-
serverAddress
¶ Returns: IP Address of the server Return type: str
-
startTime
¶ Start time and date
Returns: Start time of entry Return type: Optional[datetime.datetime]
-
status
¶ Returns: HTTP Status Code Return type: int
-
time
¶ Returns: Time taken to complete entry Return type: int
-
timings
¶ Returns: Timing of the page load Return type: dict
-
url
¶ Returns: URL of Entry Return type: str
-
-
class
haralyzer.assets.
HarPage
(page_id: str, har_parser: Optional[haralyzer.assets.HarParser] = None, har_data: dict = None)[source]¶ Bases:
object
An object representing one page of a HAR resource
-
actual_page
¶ Returns the first entry object that does not have a redirect status, indicating that it is the actual page we care about (after redirects).
Returns: First entry of the page Return type: HarEntry
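The selection rule described for actual_page can be sketched with plain dictionaries. This is a minimal illustration, not the library's implementation: it assumes that "redirect status" means any 3xx code and that each entry follows the HAR layout entry["response"]["status"]. The helper name first_non_redirect is hypothetical.

```python
from typing import Optional


def first_non_redirect(entries: list) -> Optional[dict]:
    """Return the first entry whose response status is not 3xx (assumed rule)."""
    for entry in entries:
        status = entry["response"]["status"]
        if not 300 <= status <= 399:
            return entry
    return None  # every entry was a redirect, or the list was empty


entries = [
    {"request": {"url": "http://example.com/"}, "response": {"status": 301}},
    {"request": {"url": "https://example.com/"}, "response": {"status": 200}},
    {"request": {"url": "https://example.com/app.js"}, "response": {"status": 200}},
]
landing = first_non_redirect(entries)
```

Here landing is the second entry: the 301 response is skipped and the first 200 response is treated as the page itself.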
-
audio_load_time
¶ Audio load time
Returns: Load time for audio on a page Return type: int
-
audio_size
¶ Size of audio files from the page
Returns: Size of audio files on the page Return type: int
-
audio_size_trans
¶ Audio transfer size
Returns: Size of transfer data for audio Return type: int
-
content_load_time
¶ Content load time
Returns: Load time for all content Return type: int
-
css_load_time
¶ CSS load time
Returns: Load time for CSS on a page Return type: int
-
css_size
¶ Size of CSS files from the page
Returns: Size of CSS files on the page Return type: int
-
css_size_trans
¶ CSS transfer size
Returns: Size of transfer data for CSS Return type: int
-
duplicate_url_request
¶ Returns a dict of the URLs that were requested more than once, each mapped to its number of repetitions
Returns: URLs and the amount of times they were duplicated Return type: dict
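A minimal sketch of the idea behind duplicate_url_request, using only the standard library. The helper name duplicate_urls is hypothetical, and the sketch assumes the reported number is the total count of requests for a URL; whether the library counts total occurrences or only the extra ones is not stated here.

```python
from collections import Counter


def duplicate_urls(urls: list) -> dict:
    """Map each URL seen more than once to the number of times it appears."""
    counts = Counter(urls)
    return {url: n for url, n in counts.items() if n > 1}


requested = [
    "https://example.com/",
    "https://example.com/logo.png",
    "https://example.com/logo.png",
    "https://example.com/app.js",
]
duplicates = duplicate_urls(requested)
```

Only the logo URL ends up in duplicates, because every other URL was requested exactly once.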
-
filter_entries
(request_type: str = None, content_type: str = None, status_code: str = None, http_version: str = None, load_time__gt: int = None, regex: bool = True) → List[haralyzer.assets.HarEntry][source]¶ Generate a list of entries that match the given criteria
Parameters: - request_type (str) – The request type (e.g. GET or POST)
- content_type (str) – Regex to use for finding content type
- status_code (str) – The desired status code
- http_version (str) – HTTP version of request
- load_time__gt (int) – Load time in milliseconds. If provided, an entry whose load time is less than this value will be excluded from the results.
- regex (bool) – Whether to use regex or exact match.
Returns: List of entry objects based on the filtered criteria.
Return type: List[HarEntry]
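The way these criteria combine can be sketched on raw HAR-shaped dictionaries. This is an illustration under stated assumptions, not the library's code: the helper names (_matches, filter_raw_entries, _entry) are hypothetical, the regex is applied as an unanchored, case-sensitive search, the status code is compared as text, and http_version is left out for brevity.

```python
import re


def _matches(pattern, value, regex):
    """Regex search when regex is True, otherwise exact string equality."""
    if regex:
        return re.search(pattern, value) is not None
    return pattern == value


def filter_raw_entries(entries, request_type=None, content_type=None,
                       status_code=None, load_time__gt=None, regex=True):
    """Keep the entries that satisfy every criterion that was supplied."""
    kept = []
    for entry in entries:
        method = entry["request"]["method"]
        mime = entry["response"]["content"]["mimeType"]
        status = str(entry["response"]["status"])  # compared as text
        if request_type is not None and not _matches(request_type, method, regex):
            continue
        if content_type is not None and not _matches(content_type, mime, regex):
            continue
        if status_code is not None and not _matches(status_code, status, regex):
            continue
        # Entries faster than the threshold are excluded.
        if load_time__gt is not None and entry["time"] < load_time__gt:
            continue
        kept.append(entry)
    return kept


def _entry(method, status, mime, time_ms):
    """Build a minimal HAR-shaped entry for the demonstration."""
    return {
        "request": {"method": method},
        "response": {"status": status, "content": {"mimeType": mime}},
        "time": time_ms,
    }


sample = [
    _entry("GET", 200, "image/png", 120),
    _entry("GET", 200, "text/css", 15),
    _entry("POST", 404, "application/json", 300),
]

slow_images = filter_raw_entries(sample, content_type="image", load_time__gt=100)
exact_only = filter_raw_entries(sample, content_type="image", regex=False)
```

With regex=True the pattern "image" finds image/png; with regex=False the same string matches nothing, because no entry has the exact MIME type "image".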
-
get_load_time
(request_type: str = None, content_type: str = None, status_code: str = None, asynchronous: bool = True, **kwargs) → int[source]¶ This method can return the TOTAL load time for the assets or the ACTUAL load time, the difference being that the actual load time takes asynchronous transactions into account. So, if you want the total load time, set asynchronous=False.
EXAMPLE:
I want to know the load time for images on a page that has two images, each of which took 2 seconds to download, but the browser downloaded them at the same time.
self.get_load_time(content_type='image') (returns 2)
self.get_load_time(content_type='image', asynchronous=False) (returns 4)
Parameters: - request_type (str) – The request type (e.g. GET or POST)
- content_type (str) – Regex to use for finding content type
- status_code (str) – The desired status code
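- asynchronous (bool) – Whether to take parallel downloads into account (True, the actual load time) or to sum the individual load times (False, the total load time)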
Returns: Total load time
Return type: int
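The difference between the TOTAL and the ACTUAL load time in the two-image example can be reproduced with interval arithmetic. The sketch below is one way to read that distinction, not the library's implementation; the function names are hypothetical and each asset is represented as a (start_ms, duration_ms) tuple.

```python
def total_load_time(assets):
    """Sum of every asset's own duration, ignoring overlap."""
    return sum(duration for _, duration in assets)


def actual_load_time(assets):
    """Length of the union of the [start, start + duration) intervals."""
    covered = 0
    current_end = None
    for start, duration in sorted(assets):
        end = start + duration
        if current_end is None or start >= current_end:
            # No overlap with what has been counted so far.
            covered += duration
            current_end = end
        elif end > current_end:
            # Partial overlap: count only the part that sticks out.
            covered += end - current_end
            current_end = end
    return covered


# Two images, 2 seconds each, downloaded at the same time.
images = [(0, 2000), (0, 2000)]
total_ms = total_load_time(images)
actual_ms = actual_load_time(images)
```

For the two parallel images this gives 4000 ms in total but 2000 ms of actual waiting, which corresponds to the 4 and 2 in the example.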
-
get_requests
¶ Returns a list of GET requests, each of which is a HarEntry object
Returns: All GET requests Return type: List[HarEntry]
-
static
get_total_size
(entries: List[HarEntry]) → int[source]¶ Returns the total size of a collection of entries.
Parameters: entries – List of entries to calculate the total size of. Returns: Total size of entries Return type: int
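A rough sketch of summing sizes over a collection of entries. Which field the library actually adds up is an assumption here: the sketch uses response.bodySize from the HAR format and skips the value -1, which HAR uses when the size is unknown. The helper name total_body_size is hypothetical.

```python
def total_body_size(entries):
    """Sum response body sizes, skipping unknown (-1) or missing values."""
    total = 0
    for entry in entries:
        size = entry["response"].get("bodySize", -1)
        if size > 0:
            total += size
    return total


entries = [
    {"response": {"bodySize": 1200}},
    {"response": {"bodySize": -1}},   # size not recorded
    {"response": {"bodySize": 300}},
]
size_in_bytes = total_body_size(entries)
```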
-
static
get_total_size_trans
(entries: List[HarEntry]) → int[source]¶ Returns the total transferred size of a collection of entries.
NOTE: Use with HAR files generated by chrome-har-capturer
Parameters: entries – List of entries to calculate the total size of. Returns: Total size of entries that was transferred Return type: int
-
hostname
¶ Returns: Hostname of the initial request Return type: str
-
html_load_time
¶ HTML load time
Returns: Load time for HTML on a page Return type: int
-
image_load_time
¶ Image load time
Returns: Load time for images on a page Return type: int
-
image_size
¶ Size of image files from the page
Returns: Size of image files on the page Return type: int
-
image_size_trans
¶ Image transfer size
Returns: Size of transfer data for images Return type: int
-
initial_load_time
¶ Initial load time
Returns: Initial load time of the page Return type: int
-
js_load_time
¶ JS load time
Returns: Load time for JS on a page Return type: int
-
js_size
¶ Size of JS files from the page
Returns: Size of JS files on the page Return type: int
-
js_size_trans
¶ JS transfer size
Returns: Size of transfer data for JS Return type: int
-
page_load_time
¶ Load time of the page
Returns: Load time for the page Return type: int
-
page_size
¶ Size of the page
Returns: Size of the page Return type: int
-
page_size_trans
¶ Page transfer size
Returns: Size of transfer data for the page Return type: int
-
post_requests
¶ Returns a list of POST requests, each of which is a HarEntry object
Returns: All POST requests Return type: List[HarEntry]
-
text_size
¶ Size of text files from the page
Returns: Size of text files on the page Return type: int
-
text_size_trans
¶ Text transfer size
Returns: Size of transfer data for text Return type: int
-
time_to_first_byte
¶ Returns: Time to first byte of the page request in ms Return type: int
-
url
¶ The absolute URL of the initial request.
Returns: URL of first request Return type: str
-
video_load_time
¶ Video load time
Returns: Load time for video on a page Return type: int
-
video_size
¶ Size of video files from the page
Returns: Size of video files on the page Return type: int
-
video_size_trans
¶ Video transfer size
Returns: Size of transfer data for video Return type: int
-
-
class
haralyzer.assets.
HarParser
(har_data: dict = None)[source]¶ Bases:
object
A basic HAR parser that also adds helpers for analyzing the performance of a web page.
-
browser
¶ Browser of Har File
Returns: Browser of the Har File Return type: str
-
static
create_asset_timeline
(asset_list: List[HarEntry]) → dict[source]¶ Returns a dict of the timeline for the requested assets. The key is a datetime object (down to the millisecond) of ANY time where at least one of the requested assets was loaded. The value is a list of ALL assets that were loading at that time.
Parameters: asset_list (List[HarEntry]) – The assets to create a timeline for. Returns: Milliseconds and assets that were loaded Return type: dict
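The timeline structure described above can be sketched as follows. This is an illustration only: the function name build_timeline is hypothetical, and assets are given as (name, start, duration_ms) tuples rather than HarEntry objects.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def build_timeline(assets):
    """Map every millisecond with activity to the assets loading at that time."""
    timeline = defaultdict(list)
    for name, start, duration_ms in assets:
        for offset in range(duration_ms):
            timeline[start + timedelta(milliseconds=offset)].append(name)
    return dict(timeline)


t0 = datetime(2024, 1, 1, 12, 0, 0)
timeline = build_timeline([
    ("a.png", t0, 3),                              # loads during ms 0, 1, 2
    ("b.png", t0 + timedelta(milliseconds=2), 2),  # loads during ms 2, 3
])
```

The two assets overlap at millisecond 2, so that key lists both names. In this sketch the number of keys equals the number of milliseconds during which at least one asset was loading.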
-
creator
¶ Creator of Har File. Usually the same as the browser but not always
Returns: Program that created the HarFile Return type: str
-
static
from_file
(file: [str, bytes]) → haralyzer.assets.HarParser[source]¶ Function to create a HarParser from a file path
Parameters: file ([str, bytes]) – Path to har file or bytes of har file Returns: HarParser Object Return type: HarParser
-
static
from_string
(data: [str, bytes])[source]¶ Function to load string or bytes as a HarParser
Parameters: data ([str, bytes]) – Input string or bytes Returns: HarParser Object Return type: HarParser
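Both loaders start from HAR data serialized as JSON. The sketch below shows only that decoding step with the standard library, so it does not depend on haralyzer being installed; the helper name load_har and the sample document are hypothetical. With the library available, HarParser.from_string is the documented entry point for the same input.

```python
import json


def load_har(data):
    """Decode a HAR document given as str or bytes into a dict."""
    har = json.loads(data)  # json.loads accepts both str and bytes
    if "log" not in har:
        raise ValueError("not a HAR document: missing top-level 'log' key")
    return har


raw = b'{"log": {"version": "1.2", "pages": [{"id": "page_1"}], "entries": []}}'
har = load_har(raw)
page_ids = [page["id"] for page in har["log"]["pages"]]
```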
-
hostname
¶ Hostname of first page
Returns: Hostname of the first known page Return type: str
-
static
match_content_type
(entry: haralyzer.assets.HarEntry, content_type: str, regex: bool = True) → bool[source]¶ Matches the content type of a request using the mimeType metadata.
Parameters: - entry (HarEntry) – Entry to analyze
- content_type (str) – Regex to use for finding content type
- regex (bool) – Whether to use regex or exact match.
Returns: Mime type matches
Return type: bool
-
static
match_headers
(entry: haralyzer.assets.HarEntry, header_type: str, header: str, value: str, regex: bool = True) → bool[source]¶ Function to match headers.
Since header names might appear in different cases, such as ‘content-type’ vs ‘Content-Type’, this function is case-insensitive.
Parameters: - entry (HarEntry) – Entry to analyze
- header_type (str) – Header type. Valid values: ‘request’ or ‘response’
- header (str) – The header to search for
- value (str) – The value to search for
- regex (bool) – Whether to use regex or exact match
Returns: Whether a match was found
Return type: bool
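A minimal sketch of case-insensitive header matching on HAR-style header lists, where each header is a dict with "name" and "value" keys. The helper name header_matches is hypothetical. The sketch assumes that only the header name is compared case-insensitively and that the value is matched as written; the library may treat the value differently.

```python
import re


def header_matches(headers, header, value, regex=True):
    """Return True if a header with the given name has a matching value."""
    wanted = header.lower()
    for item in headers:
        if item["name"].lower() != wanted:
            continue  # different header, keep looking
        if regex:
            if re.search(value, item["value"]) is not None:
                return True
        elif item["value"] == value:
            return True
    return False


response_headers = [
    {"name": "Content-Type", "value": "text/html; charset=utf-8"},
    {"name": "Server", "value": "nginx"},
]
is_html = header_matches(response_headers, "content-type", "text/html")
is_html_exact = header_matches(response_headers, "content-type", "text/html", regex=False)
```

The regex form finds text/html inside the full header value, while the exact form fails because the value also carries the charset parameter.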
-
static
match_http_version
(entry: haralyzer.assets.HarEntry, http_version: str, regex: bool = True) → bool[source]¶ Helper function that checks whether the HTTP version of an entry matches the given http_version argument.
Parameters: - entry (HarEntry) – Entry to analyze
- http_version (str) – HTTP version type to match
- regex (bool) – Whether to use a regex or string match
Returns: HTTP version matches
Return type: bool
-
static
match_request_type
(entry: haralyzer.assets.HarEntry, request_type: str, regex: bool = True) → bool[source]¶ Helper function that checks whether the request type of an entry matches the given request_type argument.
Parameters: - entry (HarEntry) – Entry to analyze
- request_type (str) – Request type to match
- regex (bool) – Whether to use a regex or string match
Returns: Request method matches
Return type: bool
-
static
match_status_code
(entry: haralyzer.assets.HarEntry, status_code: str, regex: bool = True) → bool[source]¶ Helper function that checks whether the status code of an entry matches the given status_code argument.
NOTE: This is doing a STRING comparison NOT NUMERICAL
Parameters: - entry (HarEntry) – Entry to analyze
- status_code (str) – Status code to search for
- regex (bool) – Whether to use a regex or string match
Returns: Status code matches
Return type: bool
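The note about STRING comparison matters in practice: the status code is treated as text, so patterns such as "2.." work, but numeric ideas such as "greater than 399" do not. The sketch below illustrates this under the assumption of an unanchored regex search; status_matches is a hypothetical helper, not the library function.

```python
import re


def status_matches(status, status_code, regex=True):
    """Compare a numeric status with a pattern after converting it to text."""
    status_text = str(status)
    if regex:
        return re.search(status_code, status_text) is not None
    return status_text == status_code


any_success = status_matches(204, "2..")                  # any 2xx status
not_success = status_matches(404, "2..")
exact_match = status_matches(200, "200", regex=False)
loose_match = status_matches(420, "20")                   # "20" occurs inside "420"
anchored = status_matches(420, "^20")
```

Because the search in this sketch is unanchored, the pattern "20" also matches 420; anchoring it as "^20" restricts it to codes that start with 20.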
-
pages
¶ This is a list of HarPage objects, each of which represents a page from the HAR file.
Returns: HarPages in the file Return type: List[HarPage]
-
version
¶ HAR Version
Returns: Version of HAR used Return type: str
-
haralyzer.errors module¶
Custom exceptions for good ol haralyzer.
haralyzer.http module¶
Creates the Request and Response subclasses that are used by each entry
-
class
haralyzer.http.
Request
(entry: dict)[source]¶ Bases:
haralyzer.mixins.HttpTransaction
Request object for a HarEntry
-
accept
¶ Returns: HTTP Accept header Return type: str
-
bodySize
¶ Returns: Body size of the request Return type: int
-
cacheControl
¶ Returns: HTTP CacheControl header Return type: str
-
cookies
¶ Returns: Cookies from the request Return type: list
-
encoding
¶ Returns: HTTP Accept-Encoding Header Return type: str
-
headersSize
¶ Returns: Headers size from the request Return type: int
-
host
¶ Returns: HTTP Host header Return type: str
-
httpVersion
¶ Returns: HTTP version used in the request Return type: str
-
language
¶ Returns: HTTP language header Return type: str
-
method
¶ Returns: HTTP method of the request Return type: str
-
mimeType
¶ Returns: Mime Type of request Return type: str
-
queryString
¶ Returns: Query string from the request Return type: list
-
text
¶ Returns: Request body Return type: str
-
url
¶ Returns: URL of the request Return type: str
-
userAgent
¶ Returns: User Agent Return type: str
-
-
class
haralyzer.http.
Response
(url: str, entry: dict)[source]¶ Bases:
haralyzer.mixins.HttpTransaction
Response object for a HarEntry
-
bodySize
¶ Returns: Body Size Return type: int
-
cacheControl
¶ Returns: Cache Control Header Return type: str
-
contentSecurityPolicy
¶ Returns: Content Security Policy Header Return type: str
-
contentSize
¶ Returns: Content Size Return type: int
-
contentType
¶ Returns: Content Type Return type: str
-
date
¶ Returns: Date of response Return type: str
-
headersSize
¶ Returns: Header size Return type: int
-
httpVersion
¶ Returns: HTTP Version Return type: str
-
lastModified
¶ Returns: Last modified time Return type: str
-
mimeType
¶ Returns: Mime Type of response Return type: str
-
redirectURL
¶ Returns: Redirect URL Return type: Optional[str]
-
status
¶ Returns: HTTP Status Return type: int
-
statusText
¶ Returns: HTTP Status Text Return type: str
-
text
¶ Returns: Response body Return type: str
-
textEncoding
¶ Returns: How the response body is encoded Return type: str
-
haralyzer.mixins module¶
Mixin Objects that allow for shared methods
-
class
haralyzer.mixins.
HttpTransaction
(entry: dict)[source]¶ Bases:
haralyzer.mixins.GetHeaders
,haralyzer.mixins.MimicDict
Class that represents a request or response
-
formatted
¶ Formatted HttpTransaction string for pretty print.
Returns: formatted string Return type: str
-
headers
¶ Headers from the entry
Returns: Headers from both request and response Return type: list
-
haralyzer.multihar module¶
Contains the multihar parser object
-
class
haralyzer.multihar.
MultiHarParser
(har_data, page_id=None, decimal_precision=0)[source]¶ Bases:
object
An object that represents multiple HAR files OF THE SAME CONTENT. It is used to gather overall statistical data in situations where you have multiple runs against the same web asset, which is common in performance testing.
-
asset_types
¶ Mimic the asset types stored in HarPage
Returns: Asset types from HarPage Return type: dict
-
audio_load_time
¶ Returns: Aggregate audio load time for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
css_load_time
¶ Returns: Aggregate css load time for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
get_load_times
(asset_type: str) → list[source]¶ Returns a list of the load times of a certain asset type for each page
Parameters: asset_type (str) – The asset type to return load times for Returns: List of load times Return type: list
-
get_stdev
(asset_type: str) → Union[int, float][source]¶ Returns the standard deviation for a set of a certain asset type.
Parameters: asset_type (str) – The asset type to calculate standard deviation for. Returns: Standard deviation, which can be an int or float depending on the self.decimal_precision Return type: int, float
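The aggregate and standard deviation values can be illustrated with the statistics module. This is a sketch under assumptions rather than the library's code: it uses the arithmetic mean for the aggregate, the sample standard deviation for the spread, and returns an int when decimal_precision is 0. The function names and the sample load times are hypothetical.

```python
from statistics import mean, stdev


def _rounded(value, decimal_precision):
    """round(x) yields an int; round(x, n) keeps a float with n decimals."""
    if decimal_precision == 0:
        return round(value)
    return round(value, decimal_precision)


def aggregate_load_time(load_times, decimal_precision=0):
    """Unweighted average of one load time per run."""
    return _rounded(mean(load_times), decimal_precision)


def load_time_stdev(load_times, decimal_precision=0):
    """Sample standard deviation; 0 when fewer than two runs exist."""
    if len(load_times) < 2:
        return 0
    return _rounded(stdev(load_times), decimal_precision)


# Page load times in milliseconds from three runs against the same page.
runs = [1010, 990, 1030]
average = aggregate_load_time(runs)
spread = load_time_stdev(runs, decimal_precision=1)
```

For these three runs the average is 1010 ms and the sample standard deviation is 20.0 ms.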
-
html_load_time
¶ Returns: Aggregate html load time for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
image_load_time
¶ Returns: Aggregate image load time for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
js_load_time
¶ Returns: Aggregate javascript load time. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
page_load_time
¶ Returns: Average total load time for all runs (not weighted). Can be an int or float depending on the self.decimal_precision Return type: int, float
-
pages
¶ Aggregate pages of all the parser objects.
Returns: All the pages from parsers Return type: List[haralyzer.assets.HarPage]
-
time_to_first_byte
¶ Returns: The aggregate time to first byte for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-
video_load_time
¶ Returns: Aggregate video load time for all pages. Can be an int or float depending on the self.decimal_precision Return type: int, float
-