WEEKLY REFLECTION - Week 3

The Summary and Reflection of Week 3 [Web Scraping].

Core Reading


Main Tasks of Week 3

Web Scraper

[webscraper]

[process]


Reflection

During this week's workshop, I selected the BBC iPlayer website for web code analysis, with the aim of identifying the code hidden behind the obvious headings and images. However, I was still unfamiliar with the code, which made the search process very time-consuming.

Nevertheless, through repeated practice and observation, I was delighted to note an improvement: compared to last week, I could now distinguish between different HTML elements more quickly. For instance, I recognised the different methods for inserting images (e.g., using <img> to embed an image, or using <a> to insert a link to an image or video).
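The difference between an embedded image and a link to media can also be checked in code. The following is a minimal sketch in Python, assuming the two elements in question are `<img>` (embedding) and `<a>` (linking). The sample markup, file names, and the class name `MediaTagCollector` are made up for illustration and are not taken from the BBC page.

```python
from html.parser import HTMLParser

# Hypothetical sample markup for illustration only.
SAMPLE_HTML = """
<h1>Programme</h1>
<img src="poster.jpg" alt="Poster">
<a href="clip.mp4">Watch the clip</a>
<a href="gallery/photo.png">Open the photo</a>
"""


class MediaTagCollector(HTMLParser):
    """Collects embedded image sources and link targets separately."""

    def __init__(self):
        super().__init__()
        self.embedded_images = []  # values of <img src="...">
        self.links = []            # values of <a href="...">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "src" in attributes:
            # <img> displays the image directly on the page.
            self.embedded_images.append(attributes["src"])
        elif tag == "a" and "href" in attributes:
            # <a> only points to a resource that opens when clicked.
            self.links.append(attributes["href"])


collector = MediaTagCollector()
collector.feed(SAMPLE_HTML)
print("Embedded images:", collector.embedded_images)
print("Links:", collector.links)
```

Keeping the two lists separate mirrors the distinction noted above: one element shows the media in place, the other only points to it.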

Additionally, the workshop covered web scraping techniques. To facilitate web scraping, the ‘Web Scraper’ extension was added to the Google Chrome browser. This AI-assisted tool proved efficient, quickly extracting the necessary data from web pages and generating CSV tables.

However, its limitations soon appeared:

  • as an AI-assisted tool, it encountered errors when scraping data and could only extract specific types of content (e.g., it might retrieve only the raw HTML code of images, or capture only webpage titles).
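To see what a scraper does when it turns page content into a CSV table, here is a rough sketch using only the Python standard library. It is not how the Web Scraper extension works internally; the sample page, the column names, and the helper `page_to_csv` are assumptions made for illustration.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content for illustration only.
SAMPLE_PAGE = """
<html><head><title>Example Programme Page</title></head>
<body>
<img src="a.jpg" alt="First still">
<img src="b.jpg" alt="Second still">
</body></html>
"""


class PageExtractor(HTMLParser):
    """Collects the page title and the src/alt attributes of each image."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []       # list of (src, alt) pairs
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            self.images.append(
                (attributes.get("src", ""), attributes.get("alt", ""))
            )

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text is kept only while the parser is inside <title>.
        if self._in_title:
            self.title += data


def page_to_csv(html_text):
    """Returns CSV text with one row per image found in html_text."""
    extractor = PageExtractor()
    extractor.feed(html_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["page_title", "image_src", "image_alt"])
    for src, alt in extractor.images:
        writer.writerow([extractor.title.strip(), src, alt])
    return buffer.getvalue()


csv_text = page_to_csv(SAMPLE_PAGE)
print(csv_text)
```

Writing the extraction by hand makes it possible to choose exactly which attributes end up in the table, instead of receiving only raw image HTML or only titles.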

I recognise that undertaking more precise and complex data scraping tasks, such as scraping with Java, requires a more solid foundation in relevant knowledge.

Created by Qixuan Wei