Python - Crawling images from json file

2020-04-22

Table of Contents

  • Analysis
  • Coding
  • Result

Analysis

url: https://www.luscious.net/albums/cassie-laine_348936/

The album page loads its images with AJAX. The request goes to a GraphQL endpoint and returns JSON:

url: https://api.luscious.net/graphql/nobatch/?operationName=AlbumListOwnPictures&query=+query+AlbumListOwnPictures%28%24input%3A+PictureListInput%21%29+%7B+picture+%7B+list%28input%3A+%24input%29+%7B+info+%7B+...FacetCollectionInfo+%7D+items+%7B+...PictureStandardWithoutAlbum+%7D+%7D+%7D+%7D+fragment+FacetCollectionInfo+on+FacetCollectionInfo+%7B+page+has_next_page+has_previous_page+total_items+total_pages+items_per_page+url_complete+%7D+fragment+PictureStandardWithoutAlbum+on+Picture+%7B+__typename+id+title+description+created+like_status+number_of_comments+number_of_favorites+moderation_status+width+height+resolution+aspect_ratio+url_to_original+url_to_video+is_animated+position+tags+%7B+category+text+url+%7D+permissions+url+thumbnails+%7B+width+height+size+url+%7D+%7D+&variables=%7B%22input%22%3A%7B%22filters%22%3A%5B%7B%22name%22%3A%22album_id%22%2C%22value%22%3A%22348936%22%7D%5D%2C%22display%22%3A%22rating_all_time%22%2C%22page%22%3A1%7D%7D
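The request URL is hard to read because its parameters are percent-encoded. As a minimal sketch, the `variables` parameter can be decoded with the standard library to see which album and page are being requested. The URL below is a shortened copy of the one above (the long `query` parameter is left out), and `decode_variables` is a hypothetical helper name, not part of the original script.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the request URL: the long `query` parameter is omitted here
api_url = (
    "https://api.luscious.net/graphql/nobatch/?operationName=AlbumListOwnPictures"
    "&variables=%7B%22input%22%3A%7B%22filters%22%3A%5B%7B%22name%22%3A%22album_id%22"
    "%2C%22value%22%3A%22348936%22%7D%5D%2C%22display%22%3A%22rating_all_time%22"
    "%2C%22page%22%3A1%7D%7D"
)

def decode_variables(url):
    """Return the GraphQL variables carried in the URL's query string as a dict."""
    params = parse_qs(urlsplit(url).query)   # percent-decoding happens here
    return json.loads(params["variables"][0])

variables = decode_variables(api_url)
print(variables["input"]["filters"])  # the album_id filter
print(variables["input"]["page"])     # the requested page number
```

The decoded variables show that the album is selected by an `album_id` filter and that the request asks for page 1.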


Coding

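The pictures are listed in the JSON response under data -> picture -> list -> items. Each item has a url_to_original field, which holds the URL of the highest-resolution image, and that is the one we download.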
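To make the nesting concrete, here is a small sketch that walks the same path on a made-up response. The sample data and the `extract_image_urls` name are assumptions for illustration only; a real response carries many more fields per item.

```python
def extract_image_urls(response):
    """Return the url_to_original of every item in one API response."""
    items = response["data"]["picture"]["list"]["items"]
    return [item["url_to_original"] for item in items]

# Made-up sample with the same nesting as the API response (placeholder URLs)
sample = {
    "data": {"picture": {"list": {
        "info": {"page": 1, "has_next_page": False},
        "items": [
            {"id": "1", "url_to_original": "https://example.com/a/first.jpg"},
            {"id": "2", "url_to_original": "https://example.com/a/second.jpg"},
        ],
    }}}
}

for url in extract_image_urls(sample):
    # Same file name rule as the script: everything after the last "/"
    print(url[url.rfind("/") + 1:])
```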

import os
import requests
from urllib.request import urlretrieve

def download(url):
    # The HTTP header name is "User-Agent" (hyphen, not underscore)
    header = { "User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36" }
    jobj = requests.get(url, headers=header).json()  # the API returns JSON

    os.makedirs('d:/img', exist_ok=True)  # make sure the target folder exists
    items = jobj['data']['picture']['list']['items']
    for item in items:
        img = item['url_to_original']  # URL of the highest-resolution image
        filename = img[img.rfind("/") + 1:]  # text after the last "/"
        filepath = 'd:/img/'+filename
        print(filepath)
        urlretrieve(img, filepath)

url = "https://api.luscious.net/graphql/nobatch/?operationName=AlbumListOwnPictures&query=+query+AlbumListOwnPictures%28%24input%3A+PictureListInput%21%29+%7B+picture+%7B+list%28input%3A+%24input%29+%7B+info+%7B+...FacetCollectionInfo+%7D+items+%7B+...PictureStandardWithoutAlbum+%7D+%7D+%7D+%7D+fragment+FacetCollectionInfo+on+FacetCollectionInfo+%7B+page+has_next_page+has_previous_page+total_items+total_pages+items_per_page+url_complete+%7D+fragment+PictureStandardWithoutAlbum+on+Picture+%7B+__typename+id+title+description+created+like_status+number_of_comments+number_of_favorites+moderation_status+width+height+resolution+aspect_ratio+url_to_original+url_to_video+is_animated+position+tags+%7B+category+text+url+%7D+permissions+url+thumbnails+%7B+width+height+size+url+%7D+%7D+&variables=%7B%22input%22%3A%7B%22filters%22%3A%5B%7B%22name%22%3A%22album_id%22%2C%22value%22%3A%22348936%22%7D%5D%2C%22display%22%3A%22rating_all_time%22%2C%22page%22%3A1%7D%7D"
download(url)
print('done!')
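The script above fetches only one page: the URL carries "page": 1 in its variables, and the query also asks for has_next_page. Assuming the API serves further pages when that number changes (this is not confirmed in the post), the URL for another page could be built as in the sketch below. `with_page` is a hypothetical helper, and the URL in the example is a shortened copy without the long `query` parameter.

```python
import json
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

def with_page(url, page):
    """Return a copy of the API URL whose variables request the given page."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    variables = json.loads(params["variables"][0])
    variables["input"]["page"] = page
    # Re-serialize compactly, then re-encode the whole query string
    params["variables"] = [json.dumps(variables, separators=(",", ":"))]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))

# Shortened copy of the request URL (the `query` parameter is omitted)
page1 = (
    "https://api.luscious.net/graphql/nobatch/?operationName=AlbumListOwnPictures"
    "&variables=%7B%22input%22%3A%7B%22filters%22%3A%5B%7B%22name%22%3A%22album_id%22"
    "%2C%22value%22%3A%22348936%22%7D%5D%2C%22display%22%3A%22rating_all_time%22"
    "%2C%22page%22%3A1%7D%7D"
)
page2 = with_page(page1, 2)
print(page2)
```

A loop could then call download() with with_page(url, n) for increasing n and stop once info.has_next_page in the response is false.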


Result

The script prints the destination path of each image as it is saved under d:/img/, then prints done! when the page has been processed.

Author: IT Team

Tags: Crawler
Last updated:2021-03-17
