Add fetchThreadImages #434

Merged
madsmtm merged 15 commits into fbchat-dev:master from mlodybercik:master on Jul 24, 2019
Conversation

@mlodybercik
Contributor

Description

I wanted to create a way of downloading all images sent in a given thread.

Research

I started by watching the network traffic on messenger.com while scrolling down the image history.
The first query looked like this:
https://www.messenger.com/webgraphql/query/?query_id=515216185516880&variables={"id":"<uid>","first":"<number>"}
Here uid is the specified user and number is how many images to download; the default is 12. Facebook will serve more than that, but I was wary of asking for more.

If you scroll down further, it shows more and more history.
The query now looks like this:
https://www.messenger.com/webgraphql/query/?query_id=515216185516880&variables={"id":"<uid>","after":"<image_id>","first":<number>}
The previous variables are unchanged, but there is a new one called after. After digging around some more, I found that image_id tells the query to start after the given image. It is called cursor in the JSON.
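To make the two request shapes above concrete, here is a minimal sketch of how such a URL could be assembled. The function name build_query_url and its parameters are made up for illustration and are not part of fbchat; the sketch only builds the URL and says nothing about cookies, authentication, or whether the request is sent as GET or POST.

```python
import json
from urllib.parse import urlencode

# Values taken from the captured requests described above.
ENDPOINT = "https://www.messenger.com/webgraphql/query/"
QUERY_ID = "515216185516880"


def build_query_url(thread_id, first=12, after=None):
    """Build the query URL for one page of a thread's images.

    `after` is the cursor of the last image already seen; leaving it out
    asks for the first page. (Hypothetical helper, not fbchat API.)
    """
    variables = {"id": thread_id, "first": first}
    if after is not None:
        variables["after"] = after
    # urlencode percent-escapes the JSON so it is safe inside a query string.
    query = urlencode({"query_id": QUERY_ID, "variables": json.dumps(variables)})
    return ENDPOINT + "?" + query
```

Calling build_query_url("<uid>") would give the first request, and passing after="<image_id>" would give the follow-up request.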

After POSTing the query, the server returns almost-readable JSON; the response begins with "for (;;);".
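The "for (;;);" prefix has to be removed before the body can be parsed as JSON. A minimal sketch of that step, assuming the rest of the body is one valid JSON document (parse_response is a hypothetical name, not an fbchat function):

```python
import json

# Prefix observed at the start of the response body.
PREFIX = "for (;;);"


def parse_response(text):
    """Strip the leading guard prefix, if present, and decode the JSON."""
    text = text.lstrip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    return json.loads(text)
```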

My development

I forked it and created my own crude function to handle it. I'll keep working on it.

@madsmtm
Member

madsmtm commented Jul 2, 2019

Hi there @krzesu0, thanks for the research!

I've previously investigated Facebook's GraphQL endpoints quite extensively, and I know that the endpoint at /webgraphql/query works similarly to the one at /api/graphqlbatch/, which we've previously implemented support for; see Client.graphql_request. So I'd suggest using that. Something like this should get you far:

data = {"id": <thread_id>, "first": 12}
j = self.graphql_request(_graphql.from_query_id("515216185516880", data))

Note: This internal API was changed recently in #439, so remember to update from master before implementing this!

When you want to parse the result, please make a helper @classmethod on the _file.ImageAttachment class, and convert the result to ImageAttachment objects.
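A rough sketch of what such a helper could look like, using a stand-in class rather than the real _file.ImageAttachment. The class name, the from_node method, and the "url" key are assumptions for illustration only; legacy_attachment_id is the one key mentioned in this pull request, and the actual response layout may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageSketch:
    """Stand-in for an attachment class; not the real ImageAttachment."""

    uid: Optional[str]
    url: Optional[str]

    @classmethod
    def from_node(cls, node):
        """Build an instance from one decoded result node (a dict).

        .get() is used so that a node without an ID yields uid=None
        instead of raising KeyError. The "url" key is hypothetical.
        """
        return cls(uid=node.get("legacy_attachment_id"), url=node.get("url"))
```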

Regarding the cursors, then I think it'd be nice if we could hide them as an implementation detail, and just return an iterable, then the user can choose how many results they want from there?

Comment thread fbchat/_util.py Outdated
@mlodybercik
Contributor Author

> When you want to parse the result, please make a helper @classmethod on the _file.ImageAttachment class, and convert the result to ImageAttachment objects.

> Regarding the cursors, then I think it'd be nice if we could hide them as an implementation detail, and just return an iterable, then the user can choose how many results they want from there?

I rewrote this as @madsmtm suggested. I decided to hide the cursor from the user. Both _fetchImages (the helper) and fetchThreadImages are now generators.
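As an illustration of the idea (not the code in this pull request), a generator can keep the cursor internal and let the caller decide how many results to take. Everything below is a sketch under assumptions: iter_images, fake_fetch_page, and the (items, next_cursor) page shape are hypothetical, and the fake data stands in for real network requests.

```python
from itertools import islice


def iter_images(fetch_page, page_size=12):
    """Yield images page by page while keeping the cursor internal.

    `fetch_page(after, first)` must return `(items, next_cursor)`, where
    `next_cursor` is None once there are no further pages.
    """
    cursor = None
    while True:
        items, cursor = fetch_page(cursor, page_size)
        yield from items
        if cursor is None or not items:
            return


# Stand-in data source so the sketch runs without any network access.
FAKE_IMAGES = [f"image-{i}" for i in range(30)]


def fake_fetch_page(after, first):
    """Pretend endpoint: the cursor is simply the next list index as a string."""
    start = 0 if after is None else int(after)
    end = start + first
    next_cursor = str(end) if end < len(FAKE_IMAGES) else None
    return FAKE_IMAGES[start:end], next_cursor


# The caller picks how many results to consume; no cursor is visible here.
first_five = list(islice(iter_images(fake_fetch_page), 5))
```

Because the generator is lazy, taking only five items requests only the first page.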

Comment thread fbchat/_util.py Outdated
Member

@madsmtm madsmtm left a comment

In general, sorry I caused you so much trouble with the various merges, and thanks for the hard work!

Comment thread fbchat/_file.py
Comment thread fbchat/_client.py Outdated
Comment thread fbchat/_client.py Outdated
Comment thread fbchat/_client.py Outdated
Comment thread fbchat/_client.py
Comment thread fbchat/_client.py Outdated
@mlodybercik
Contributor Author

I don't think this feature will work for thread videos: the endpoint returns only the preview URL and the IDs. Messenger then uses these IDs to download the video from /mercury/attachments/video/?video_id=.... This could be another feature for me to work on.

Member

@madsmtm madsmtm left a comment

I tried it out today, and left a few comments, but otherwise good work 👍.

Regarding videos, then I think you're right, it uses another system. You should be fully welcome to implement that (in a separate PR 😉) when you want to!

Comment thread fbchat/_client.py
Comment thread fbchat/_file.py Outdated
Comment thread fbchat/_file.py Outdated
Comment thread fbchat/_client.py Outdated
mlodybercik and others added 2 commits July 24, 2019 12:58
> Use the attachment ID instead of the ID returned by the endpoint.

> Use legacy_attachment_id instead of uid. And for some attachments, such as MessageAudio (which this endpoint also returns), you won't even get an id!

Co-Authored-By: Mads Marquart <madsmtm@gmail.com>
Member

@madsmtm madsmtm left a comment

Nice work, I'll go ahead and merge this now! 🎉

@madsmtm madsmtm merged commit 700cf14 into fbchat-dev:master Jul 24, 2019
@madsmtm madsmtm changed the title WIP: fetchThreadImages or feature request Add fetchThreadImages Jul 25, 2019