processing.html

HTML processing functions

def extract_hyperlinks(soup: BeautifulSoup, base_url: str) -> list[tuple[str, str]]

Extract hyperlinks from a BeautifulSoup object

Arguments:

  • soup BeautifulSoup - The BeautifulSoup object
  • base_url str - The base URL

Returns:

  • List[Tuple[str, str]] - The extracted hyperlinks
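
A minimal sketch of how a function with this signature might behave is shown below. It is not taken from the library source. It assumes that each returned tuple is (link text, URL), that base_url is used to resolve relative href values via urllib.parse.urljoin, and that anchors without an href attribute are skipped. The sample HTML and URLs are made up for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_hyperlinks(soup: BeautifulSoup, base_url: str) -> list[tuple[str, str]]:
    """Return (link text, absolute URL) pairs for every <a> tag that has an href.

    Assumption: relative hrefs are resolved against base_url.
    """
    return [
        (link.get_text(strip=True), urljoin(base_url, link["href"]))
        for link in soup.find_all("a", href=True)  # href=True skips anchors without an href
    ]


# Hypothetical input: one relative link, one absolute link, one anchor with no href.
html = (
    '<p><a href="/docs">Docs</a> '
    '<a href="https://example.org/x">External</a> '
    "<a>No href</a></p>"
)
soup = BeautifulSoup(html, "html.parser")
links = extract_hyperlinks(soup, "https://example.com/start/")
```

Under these assumptions, the relative link /docs resolves against the host of the base URL, the absolute link is returned unchanged, and the anchor without an href is dropped.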

def format_hyperlinks(hyperlinks: list[tuple[str, str]]) -> list[str]

Format hyperlinks to be displayed to the user

Arguments:

  • hyperlinks List[Tuple[str, str]] - The hyperlinks to format

Returns:

  • List[str] - The formatted hyperlinks
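
The documentation does not state the output format, so the following is only a sketch under an assumed convention: each (text, URL) pair is rendered as "text (URL)". The function body and the sample data are illustrative, not the library's confirmed implementation.

```python
def format_hyperlinks(hyperlinks: list[tuple[str, str]]) -> list[str]:
    """Render each (link text, URL) pair as a single display string.

    Assumption: the "text (url)" layout is chosen for illustration only.
    """
    return [f"{link_text} ({link_url})" for link_text, link_url in hyperlinks]


# Hypothetical input in the shape returned by extract_hyperlinks.
sample = [
    ("Docs", "https://example.com/docs"),
    ("External", "https://example.org/x"),
]
formatted = format_hyperlinks(sample)
```

Keeping formatting separate from extraction means the (text, URL) pairs can be reused for other purposes, such as deduplication or filtering, before they are turned into display strings.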