Stream: Google Code-in

Topic: python script


Jeff Sieu (Dec 10 2017 at 05:41):

For the task that requires a Python script to download GCI data, does the data have to include task submissions like in the example link?

Jeff Sieu (Dec 11 2017 at 03:43):

@Sean Hey Sean, I have finished the knight, but I'm still on the Python script task and I have no clue how to download task submissions with the API. I've read the API docs, and it only returns JSON with info about tasks and task instances, not the files submitted to each task instance.

Jeff Sieu (Dec 11 2017 at 03:43):

Any ideas?

Sean (Dec 11 2017 at 06:21):

For the task that requires a Python script to download GCI data, does the data have to include task submissions like in the example link?

Unless there's strictly no way to get that data -- but yes, the entire point of that task is to download all of the files that have been uploaded.

Jeff Sieu (Dec 11 2017 at 06:23):

Can't seem to find a way to download submission data through the API though, so I've submitted my knight task first.

Sean (Dec 11 2017 at 06:39):

Any ideas?

There was support/contact information for the API, so feel free to contact them on how to get at the files submitted.

Jeff Sieu (Dec 11 2017 at 07:13):

So, I've just contacted the GCI API support, and it turns out the API does not support this.

Sean (Dec 11 2017 at 07:13):

Did Robert give any hints?

Sean (Dec 11 2017 at 07:14):

Is there perhaps a link to the GCI site in the API, maybe a sub-URL?

Sean (Dec 11 2017 at 07:14):

where the links could be scraped?

Sean (Dec 11 2017 at 07:14):

or the whole page for that matter, wget style

Jeff Sieu (Dec 11 2017 at 07:27):

From the past submissions, it does seem like the HTML page was scraped.
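For reference, pulling links out of a fetched HTML page needs only the standard library; a minimal sketch (the sample markup below is invented for illustration, not actual GCI page structure):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag encountered while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

extractor = LinkExtractor()
extractor.feed('<p><a href="/files/a.zip">a</a> <a href="/files/b.zip">b</a></p>')
print(extractor.links)  # → ['/files/a.zip', '/files/b.zip']
```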

Jeff Sieu (Dec 11 2017 at 07:27):

Not much, I only got "The API does not support this.

(And it is highly unlikely to support it this year.)

-R

"

Sean (Dec 11 2017 at 07:28):

the past ones were scraped, but that was a completely different system

Sean (Dec 11 2017 at 07:29):

can you link me to the task description?

Sean (Dec 11 2017 at 07:30):

note that the task was broken into two parts -- first part is the basic instance info

Sean (Dec 11 2017 at 07:30):

second task was to download the data files (somehow)

Sean (Dec 11 2017 at 07:30):

https://codein.withgoogle.com/dashboard/tasks/4537825565343744/

Sean (Dec 11 2017 at 07:32):

I just updated the description

Sean (Dec 11 2017 at 07:33):

@Jeff Sieu so yeah, I see there is task_instance_url so that could be scraped using urllib or wget: https://stackoverflow.com/questions/24346872/python-equivalent-of-a-given-wget-command
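A sketch of that wget-style approach with urllib: fetch one URL (e.g. a task_instance_url) and save it to disk. The function and directory names here are illustrative, not part of the GCI API:

```python
import urllib.request
from pathlib import Path

def filename_from_url(url):
    """Derive a local file name from the last path segment of a URL."""
    return url.rstrip("/").rsplit("/", 1)[-1] or "index.html"

def download(url, dest_dir="downloads"):
    """Fetch a single URL and save its raw bytes under dest_dir."""
    Path(dest_dir).mkdir(exist_ok=True)
    dest = Path(dest_dir) / filename_from_url(url)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        out.write(resp.read())
    return dest
```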

Jeff Sieu (Dec 17 2017 at 12:41):

@Sean, seems like the error is probably related to proxies though I've no idea about the specifics

Jeff Sieu (Dec 17 2017 at 12:47):

Since the error is due to max retries exceeded (the default is 0), maybe we can try setting max_retries to a high number (say 200) like this:
import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
# mount() takes a URL prefix; requests made through s to matching URLs
# will use this adapter and retry failed connections up to 200 times
s.mount(url, HTTPAdapter(max_retries=200))

Jeff Sieu (Dec 17 2017 at 12:48):

Referencing https://stackoverflow.com/questions/15431044/can-i-set-max-retries-for-requests-request
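A variant of that idea, sketched here under the same requests setup: instead of a flat retry count, urllib3's Retry class (which requests uses internally) supports exponential backoff and retrying on specific HTTP status codes. The prefix and numbers below are illustrative:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 5 times with exponential backoff (0.5s, 1s, 2s, ...);
# status_forcelist also retries transient server errors, not just
# connection failures.
retry = Retry(total=5, backoff_factor=0.5,
              status_forcelist=[500, 502, 503, 504])

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
```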


Last updated: Oct 09 2024 at 00:44 UTC