Download problem #10

Open
aopolin-lv opened this issue Jul 26, 2023 · 4 comments

Comments

@aopolin-lv

aopolin-lv commented Jul 26, 2023

While downloading the pretrain dataset with modelscope, I get the following error:

2023-07-26 14:05:26,858 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2023-07-26 14:05:27,483 - modelscope - INFO - Loading done! Current index file version is 1.7.1, with md5 1a3c80f9923ff896da3e2a4786eadd0f and a total number of 861 components indexed
2023-07-26 14:05:47,880 - modelscope - INFO - Reusing cached meta-data file: /root/.cache/modelscope/hub/datasets/modelscope/Youku-AliceMind/master/meta/8675a4d533a4241f99abcf63d2356b01
Overall progress:   0%|                                                                                                                                                                                                                                                                                 | 0/10009370 [00:00<?, ?it/s]2023-07-26 14:06:26,748 - modelscope - INFO - Reusing cached meta-data file: /root/.cache/modelscope/hub/datasets/modelscope/Youku-AliceMind/master/meta/8675a4d533a4241f99abcf63d2356b01
2023-07-26 14:07:06,106 - modelscope - ERROR - 'DataDownloadConfig' object has no attribute 'storage_options'
Overall progress:   0%|                                                                                                                                                                                                                                                                                 | 0/10009370 [00:39<?, ?it/s]
{'video_id:FILE': ['videos/pretrain/14111Y1211b-1134b18bAE55bFE7Jbb7135YE3aY54EaB14ba7CbAa1AbACB24527A.flv'], 'title': ['妈妈给宝宝听胎心,看看宝宝是怎么做的,太调皮了']}

How can I fix this?

@xiaomao19970819

Did you manage to solve this problem?

@aopolin-lv
Author

Did you manage to solve this problem?

No.

@cxry-wxr

Has this problem been solved?

@MinliangLin

Hi folks, it seems the new version of datasets is not compatible with modelscope. The code below works for me:

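from modelscope.msdatasets import MsDataset
# from modelscope.utils.constant import DownloadMode  # only needed for download_mode below

ds = MsDataset.load(
    "Youku-AliceMind",
    namespace="modelscope",
    subset_name="caption",
    split="validation",  # Options: train, test, validation
    # download_mode=DownloadMode.FORCE_REDOWNLOAD,  # use this if you need to clear the cache
    use_streaming=True,
)
# Workaround: add the attribute that the error message reports as missing.
ds._dataset_context_config._download_config.storage_options = {}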
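
In case it helps anyone else, here is a slightly more defensive version of that last line, as a rough sketch only. `ensure_storage_options` is a hypothetical helper name, not part of modelscope, and it relies on the same private attributes (`_dataset_context_config`, `_download_config`) used above, which may change between modelscope versions. It only sets `storage_options` when the attribute is missing, so it should do nothing on a setup that already has it. The stand-in object at the bottom just mimics the attribute layout so the snippet runs without modelscope installed.

```python
from types import SimpleNamespace


def ensure_storage_options(ds):
    """Add an empty storage_options to the download config if it has none.

    Hypothetical helper: it assumes the private attribute layout shown in
    the workaround above. Returns True if the attribute was added.
    """
    context = getattr(ds, "_dataset_context_config", None)
    download_config = getattr(context, "_download_config", None)
    if download_config is None:
        # The object does not have the expected layout; leave it alone.
        return False
    if hasattr(download_config, "storage_options"):
        # Already present, so there is nothing to patch.
        return False
    download_config.storage_options = {}
    return True


# Stand-in for a loaded dataset (NOT a real MsDataset), used only to
# show the helper's behavior.
fake_ds = SimpleNamespace(
    _dataset_context_config=SimpleNamespace(_download_config=SimpleNamespace())
)
patched = ensure_storage_options(fake_ds)
```

With a real dataset you would call `ensure_storage_options(ds)` right after `MsDataset.load(...)` and before iterating over `ds`.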
