release 2016.09.04

[theplatform] Fix player regex (closes #10546)
[youtube:playlist] Extend _VALID_URL
2026-01-24 00:00:10 -05:00 · 2016-09-04 20:51:48 +07:00 · 2016-09-04 14:24:41 +01:00 · 2016-09-04 20:12:34 +07:00 · 2016-09-04 11:45:29 +01:00 · 2016-09-04 11:45:29 +01:00
22 changed files with 327 additions and 147 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.03*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.03**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.04*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.04**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.09.03
+[debug] youtube-dl version 2016.09.04
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+version 2016.09.04
+
+Core
+* If the first segment of DASH fails, abort the whole download process to
+  prevent throttling (#10497)
+
+Extractors
+* [pornovoisines] Fix extraction (#10469)
+* [rottentomatoes] Fix extraction (#10467)
+* [youjizz] Fix extraction (#10437)
++ [foxnews] Add support for FoxNews Insider (#10445)
++ [fc2] Recognize Flash player URLs (#10512)
+
+
 version 2016.09.03

 Core
--- a/README.md
+++ b/README.md
@@ -89,6 +89,8 @@ which means you can modify it, redistribute it or use it however you like.
    --mark-watched                   Mark videos watched (YouTube only)
    --no-mark-watched                Do not mark videos watched (YouTube only)
    --no-color                       Do not emit color codes in output
+    --abort-on-unavailable-fragment  Abort downloading when some fragment is not
+                                     available

 ## Network Options:
    --proxy URL                      Use the specified HTTP/HTTPS/SOCKS proxy.
@@ -173,7 +175,10 @@ which means you can modify it, redistribute it or use it however you like.
    -R, --retries RETRIES            Number of retries (default is 10), or
                                     "infinite".
    --fragment-retries RETRIES       Number of retries for a fragment (default
-                                     is 10), or "infinite" (DASH only)
+                                     is 10), or "infinite" (DASH and hlsnative
+                                     only)
+    --skip-unavailable-fragments     Skip unavailable fragments (DASH and
+                                     hlsnative only)
    --buffer-size SIZE               Size of download buffer (e.g. 1024 or 16K)
                                     (default is 1024)
    --no-resize-buffer               Do not automatically adjust the buffer
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -232,6 +232,7 @@
 - **FacebookPluginsVideo**
 - **faz.net**
 - **fc2**
+ - **fc2:embed**
 - **Fczenit**
 - **features.aol.com**
 - **fernsehkritik.tv**
@@ -245,6 +246,7 @@
 - **FOX**
 - **Foxgay**
 - **FoxNews**: Fox News and Fox Business Video
+ - **foxnews:insider**
 - **FoxSports**
 - **france2.fr:generation-quoi**
 - **FranceCulture**
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -318,6 +318,7 @@ def _real_main(argv=None):
        'nooverwrites': opts.nooverwrites,
        'retries': opts.retries,
        'fragment_retries': opts.fragment_retries,
+        'skip_unavailable_fragments': opts.skip_unavailable_fragments,
        'buffersize': opts.buffersize,
        'noresizebuffer': opts.noresizebuffer,
        'continuedl': opts.continue_dl,
--- a/youtube_dl/downloader/dash.py
+++ b/youtube_dl/downloader/dash.py
@@ -38,8 +38,10 @@ class DashSegmentsFD(FragmentFD):
        segments_filenames = []

        fragment_retries = self.params.get('fragment_retries', 0)
+        skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)

-        def append_url_to_file(target_url, tmp_filename, segment_name):
+        def process_segment(segment, tmp_filename, fatal):
+            target_url, segment_name = segment
            target_filename = '%s-%s' % (tmp_filename, segment_name)
            count = 0
            while count <= fragment_retries:
@@ -52,26 +54,35 @@ class DashSegmentsFD(FragmentFD):
                    down.close()
                    segments_filenames.append(target_sanitized)
                    break
-                except (compat_urllib_error.HTTPError, ) as err:
+                except compat_urllib_error.HTTPError as err:
                    # YouTube may often return 404 HTTP error for a fragment causing the
                    # whole download to fail. However if the same fragment is immediately
                     # retried with the same request data this usually succeeds (1-2 attempts
                    # is usually enough) thus allowing to download the whole file successfully.
-                    # So, we will retry all fragments that fail with 404 HTTP error for now.
-                    if err.code != 404:
-                        raise
-                    # Retry fragment
+                    # To be future-proof we will retry all fragments that fail with any
+                    # HTTP error.
                    count += 1
                    if count <= fragment_retries:
-                        self.report_retry_fragment(segment_name, count, fragment_retries)
+                        self.report_retry_fragment(err, segment_name, count, fragment_retries)
            if count > fragment_retries:
+                if not fatal:
+                    self.report_skip_fragment(segment_name)
+                    return True
                self.report_error('giving up after %s fragment retries' % fragment_retries)
                return False
+            return True

-        if initialization_url:
-            append_url_to_file(initialization_url, ctx['tmpfilename'], 'Init')
-        for i, segment_url in enumerate(segment_urls):
-            append_url_to_file(segment_url, ctx['tmpfilename'], 'Seg%d' % i)
+        segments_to_download = [(initialization_url, 'Init')] if initialization_url else []
+        segments_to_download.extend([
+            (segment_url, 'Seg%d' % i)
+            for i, segment_url in enumerate(segment_urls)])
+
+        for i, segment in enumerate(segments_to_download):
+            # In DASH, the first segment contains necessary headers to
+            # generate a valid MP4 file, so always abort for the first segment
+            fatal = i == 0 or not skip_unavailable_fragments
+            if not process_segment(segment, ctx['tmpfilename'], fatal):
+                return False

        self._finish_frag_download(ctx)
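
Note on the dash.py hunk above: the new control flow is retry first, then either skip or abort, with the first segment always treated as fatal. The following is a minimal standalone sketch of that flow, not the downloader's actual code. `FragmentUnavailable`, `download_fragments`, `flaky_fetch` and the `log` callback are hypothetical names introduced for illustration; only the `fragment_retries` and `skip_unavailable_fragments` parameter names come from the diff.

```python
class FragmentUnavailable(Exception):
    """Hypothetical stand-in for the HTTP error raised while fetching a fragment."""


def download_fragments(names, fetch, fragment_retries=0,
                       skip_unavailable_fragments=True, log=print):
    """Fetch fragments in order; return the list of payloads, or None on abort."""
    downloaded = []
    for i, name in enumerate(names):
        # The first fragment carries the headers, so it is never skippable.
        fatal = i == 0 or not skip_unavailable_fragments
        count = 0
        while count <= fragment_retries:
            try:
                downloaded.append(fetch(name))
                break
            except FragmentUnavailable as err:
                count += 1
                if count <= fragment_retries:
                    log('%s: retrying fragment %s (attempt %d of %d)'
                        % (err, name, count, fragment_retries))
        if count > fragment_retries:
            if not fatal:
                log('skipping fragment %s' % name)
                continue
            log('giving up after %d fragment retries' % fragment_retries)
            return None
    return downloaded


# Demo with a fake fetcher: Seg0 fails once, Seg1 never succeeds.
attempts = {}


def flaky_fetch(name):
    attempts[name] = attempts.get(name, 0) + 1
    if name == 'Seg1' or (name == 'Seg0' and attempts[name] == 1):
        raise FragmentUnavailable('HTTP Error 404')
    return name.encode()


messages = []
result = download_fragments(
    ['Init', 'Seg0', 'Seg1', 'Seg2'], flaky_fetch,
    fragment_retries=2, log=messages.append)
print(result)
```

In this sketch a skipped fragment simply leaves a gap in the output, which matches the trade-off the new --skip-unavailable-fragments option describes: a possibly incomplete file instead of a failed download.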

--- a/youtube_dl/downloader/fragment.py
+++ b/youtube_dl/downloader/fragment.py
@@ -6,6 +6,7 @@ import time
 from .common import FileDownloader
 from .http import HttpFD
 from ..utils import (
+    error_to_compat_str,
    encodeFilename,
    sanitize_open,
 )
@@ -22,13 +23,19 @@ class FragmentFD(FileDownloader):

    Available options:

-    fragment_retries:   Number of times to retry a fragment for HTTP error (DASH only)
+    fragment_retries:   Number of times to retry a fragment for HTTP error (DASH
+                        and hlsnative only)
+    skip_unavailable_fragments:
+                        Skip unavailable fragments (DASH and hlsnative only)
    """

-    def report_retry_fragment(self, fragment_name, count, retries):
+    def report_retry_fragment(self, err, fragment_name, count, retries):
        self.to_screen(
-            '[download] Got server HTTP error. Retrying fragment %s (attempt %d of %s)...'
-            % (fragment_name, count, self.format_retries(retries)))
+            '[download] Got server HTTP error: %s. Retrying fragment %s (attempt %d of %s)...'
+            % (error_to_compat_str(err), fragment_name, count, self.format_retries(retries)))
+
+    def report_skip_fragment(self, fragment_name):
+        self.to_screen('[download] Skipping fragment %s...' % fragment_name)

    def _prepare_and_start_frag_download(self, ctx):
        self._prepare_frag_download(ctx)
--- a/youtube_dl/downloader/hls.py
+++ b/youtube_dl/downloader/hls.py
@@ -13,6 +13,7 @@ from .fragment import FragmentFD
 from .external import FFmpegFD

 from ..compat import (
+    compat_urllib_error,
    compat_urlparse,
    compat_struct_pack,
 )
@@ -83,6 +84,10 @@ class HlsFD(FragmentFD):

        self._prepare_and_start_frag_download(ctx)

+        fragment_retries = self.params.get('fragment_retries', 0)
+        skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
+        test = self.params.get('test', False)
+
        extra_query = None
        extra_param_to_segment_url = info_dict.get('extra_param_to_segment_url')
        if extra_param_to_segment_url:
@@ -99,15 +104,37 @@ class HlsFD(FragmentFD):
                        line
                        if re.match(r'^https?://', line)
                        else compat_urlparse.urljoin(man_url, line))
-                    frag_filename = '%s-Frag%d' % (ctx['tmpfilename'], i)
+                    frag_name = 'Frag%d' % i
+                    frag_filename = '%s-%s' % (ctx['tmpfilename'], frag_name)
                    if extra_query:
                        frag_url = update_url_query(frag_url, extra_query)
-                    success = ctx['dl'].download(frag_filename, {'url': frag_url})
-                    if not success:
+                    count = 0
+                    while count <= fragment_retries:
+                        try:
+                            success = ctx['dl'].download(frag_filename, {'url': frag_url})
+                            if not success:
+                                return False
+                            down, frag_sanitized = sanitize_open(frag_filename, 'rb')
+                            frag_content = down.read()
+                            down.close()
+                            break
+                        except compat_urllib_error.HTTPError as err:
+                            # Unavailable (possibly temporary) fragments may be served.
+                            # First we try to retry then either skip or abort.
+                            # See https://github.com/rg3/youtube-dl/issues/10165,
+                            # https://github.com/rg3/youtube-dl/issues/10448).
+                            count += 1
+                            if count <= fragment_retries:
+                                self.report_retry_fragment(err, frag_name, count, fragment_retries)
+                    if count > fragment_retries:
+                        if skip_unavailable_fragments:
+                            i += 1
+                            media_sequence += 1
+                            self.report_skip_fragment(frag_name)
+                            continue
+                        self.report_error(
+                            'giving up after %s fragment retries' % fragment_retries)
                        return False
-                    down, frag_sanitized = sanitize_open(frag_filename, 'rb')
-                    frag_content = down.read()
-                    down.close()
                    if decrypt_info['METHOD'] == 'AES-128':
                        iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
                        frag_content = AES.new(
@@ -115,7 +142,7 @@ class HlsFD(FragmentFD):
                    ctx['dest_stream'].write(frag_content)
                    frags_filenames.append(frag_sanitized)
                    # We only download the first fragment during the test
-                    if self.params.get('test', False):
+                    if test:
                        break
                    i += 1
                    media_sequence += 1
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -1163,13 +1163,6 @@ class InfoExtractor(object):
                              m3u8_id=None, note=None, errnote=None,
                              fatal=True, live=False):

-        formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
-
-        format_url = lambda u: (
-            u
-            if re.match(r'^https?://', u)
-            else compat_urlparse.urljoin(m3u8_url, u))
-
        res = self._download_webpage_handle(
            m3u8_url, video_id,
            note=note or 'Downloading m3u8 information',
@@ -1180,6 +1173,13 @@ class InfoExtractor(object):
        m3u8_doc, urlh = res
        m3u8_url = urlh.geturl()

+        formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
+
+        format_url = lambda u: (
+            u
+            if re.match(r'^https?://', u)
+            else compat_urlparse.urljoin(m3u8_url, u))
+
        # We should try extracting formats only from master playlists [1], i.e.
        # playlists that describe available qualities. On the other hand media
        # playlists [2] should be returned as is since they contain just the media
@@ -1749,7 +1749,7 @@ class InfoExtractor(object):
            media_attributes = extract_attributes(media_tag)
            src = media_attributes.get('src')
            if src:
-                _, formats = _media_formats(src)
+                _, formats = _media_formats(src, media_type)
                media_info['formats'].extend(formats)
            media_info['thumbnail'] = media_attributes.get('poster')
            if media_content:
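
Note on the first common.py hunk above: `formats` and `format_url` are now built after `m3u8_url = urlh.geturl()`, so relative playlist entries appear to be resolved against the final URL of the response rather than the URL that was requested. A small sketch of why the order matters, assuming the manifest request was redirected; the two URLs and the `make_format_url` helper are made up for illustration, and Python 3's `urllib.parse.urljoin` stands in for `compat_urlparse.urljoin`.

```python
import re
from urllib.parse import urljoin


def make_format_url(m3u8_url):
    """Build a resolver bound to one base URL, like the lambda in the diff."""
    def format_url(u):
        return u if re.match(r'^https?://', u) else urljoin(m3u8_url, u)
    return format_url


# Hypothetical URLs: the requested manifest redirects to a CDN host.
requested = 'http://example.com/play/master.m3u8'
final = 'http://cdn.example.net/v/123/master.m3u8'

before = make_format_url(requested)('720p/index.m3u8')
after = make_format_url(final)('720p/index.m3u8')
absolute = make_format_url(final)('https://other.example.org/a.m3u8')

print(before)
print(after)
```
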
--- a/youtube_dl/extractor/espn.py
+++ b/youtube_dl/extractor/espn.py
@@ -5,7 +5,7 @@ from ..utils import remove_end


 class ESPNIE(InfoExtractor):
-    _VALID_URL = r'https?://espn\.go\.com/(?:[^/]+/)*(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:espn\.go|(?:www\.)?espn)\.com/(?:[^/]+/)*(?P<id>[^/]+)'
    _TESTS = [{
        'url': 'http://espn.go.com/video/clip?id=10365079',
        'md5': '60e5d097a523e767d06479335d1bdc58',
@@ -47,6 +47,9 @@ class ESPNIE(InfoExtractor):
    }, {
        'url': 'http://espn.go.com/nba/playoffs/2015/story/_/id/12887571/john-wall-washington-wizards-no-swelling-left-hand-wrist-game-5-return',
        'only_matching': True,
+    }, {
+        'url': 'http://www.espn.com/video/clip?id=10365079',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -269,7 +269,10 @@ from .facebook import (
    FacebookPluginsVideoIE,
 )
 from .faz import FazIE
-from .fc2 import FC2IE
+from .fc2 import (
+    FC2IE,
+    FC2EmbedIE,
+)
 from .fczenit import FczenitIE
 from .firstpost import FirstpostIE
 from .firsttv import FirstTVIE
@@ -284,7 +287,10 @@ from .formula1 import Formula1IE
 from .fourtube import FourTubeIE
 from .fox import FOXIE
 from .foxgay import FoxgayIE
-from .foxnews import FoxNewsIE
+from .foxnews import (
+    FoxNewsIE,
+    FoxNewsInsiderIE,
+)
 from .foxsports import FoxSportsIE
 from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
--- a/youtube_dl/extractor/fc2.py
+++ b/youtube_dl/extractor/fc2.py
@@ -1,10 +1,12 @@
-#! -*- coding: utf-8 -*-
+# coding: utf-8
 from __future__ import unicode_literals

 import hashlib
+import re

 from .common import InfoExtractor
 from ..compat import (
+    compat_parse_qs,
    compat_urllib_request,
    compat_urlparse,
 )
@@ -16,7 +18,7 @@ from ..utils import (


 class FC2IE(InfoExtractor):
-    _VALID_URL = r'^https?://video\.fc2\.com/(?:[^/]+/)*content/(?P<id>[^/]+)'
+    _VALID_URL = r'^(?:https?://video\.fc2\.com/(?:[^/]+/)*content/|fc2:)(?P<id>[^/]+)'
    IE_NAME = 'fc2'
    _NETRC_MACHINE = 'fc2'
    _TESTS = [{
@@ -75,12 +77,17 @@ class FC2IE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
        self._login()
-        webpage = self._download_webpage(url, video_id)
-        self._downloader.cookiejar.clear_session_cookies()  # must clear
-        self._login()
+        webpage = None
+        if not url.startswith('fc2:'):
+            webpage = self._download_webpage(url, video_id)
+            self._downloader.cookiejar.clear_session_cookies()  # must clear
+            self._login()

-        title = self._og_search_title(webpage)
-        thumbnail = self._og_search_thumbnail(webpage)
+        title = 'FC2 video %s' % video_id
+        thumbnail = None
+        if webpage is not None:
+            title = self._og_search_title(webpage)
+            thumbnail = self._og_search_thumbnail(webpage)
        refer = url.replace('/content/', '/a/content/') if '/a/content/' not in url else url

        mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()
@@ -113,3 +120,41 @@ class FC2IE(InfoExtractor):
            'ext': 'flv',
            'thumbnail': thumbnail,
        }
+
+
+class FC2EmbedIE(InfoExtractor):
+    _VALID_URL = r'https?://video\.fc2\.com/flv2\.swf\?(?P<query>.+)'
+    IE_NAME = 'fc2:embed'
+
+    _TEST = {
+        'url': 'http://video.fc2.com/flv2.swf?t=201404182936758512407645&i=20130316kwishtfitaknmcgd76kjd864hso93htfjcnaogz629mcgfs6rbfk0hsycma7shkf85937cbchfygd74&i=201403223kCqB3Ez&d=2625&sj=11&lang=ja&rel=1&from=11&cmt=1&tk=TlRBM09EQTNNekU9&tl=プリズン･ブレイク%20S1-01%20マイケル%20【吹替】',
+        'md5': 'b8aae5334cb691bdb1193a88a6ab5d5a',
+        'info_dict': {
+            'id': '201403223kCqB3Ez',
+            'ext': 'flv',
+            'title': 'プリズン･ブレイク S1-01 マイケル 【吹替】',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        query = compat_parse_qs(mobj.group('query'))
+
+        video_id = query['i'][-1]
+        title = query.get('tl', ['FC2 video %s' % video_id])[0]
+
+        sj = query.get('sj', [None])[0]
+        thumbnail = None
+        if sj:
+            # See thumbnailImagePath() in ServerConst.as of flv2.swf
+            thumbnail = 'http://video%s-thumbnail.fc2.com/up/pic/%s.jpg' % (
+                sj, '/'.join((video_id[:6], video_id[6:8], video_id[-2], video_id[-1], video_id)))
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': FC2IE.ie_key(),
+            'url': 'fc2:%s' % video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+        }
--- a/youtube_dl/extractor/foxnews.py
+++ b/youtube_dl/extractor/foxnews.py
@@ -3,11 +3,12 @@ from __future__ import unicode_literals
 import re

 from .amp import AMPIE
+from .common import InfoExtractor


 class FoxNewsIE(AMPIE):
    IE_DESC = 'Fox News and Fox Business Video'
-    _VALID_URL = r'https?://(?P<host>video\.fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
+    _VALID_URL = r'https?://(?P<host>video\.(?:insider\.)?fox(?:news|business)\.com)/v/(?:video-embed\.html\?video_id=)?(?P<id>\d+)'
    _TESTS = [
        {
            'url': 'http://video.foxnews.com/v/3937480/frozen-in-time/#sp=show-clips',
@@ -49,6 +50,11 @@ class FoxNewsIE(AMPIE):
            'url': 'http://video.foxbusiness.com/v/4442309889001',
            'only_matching': True,
        },
+        {
+            # From http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words
+            'url': 'http://video.insider.foxnews.com/v/video-embed.html?video_id=5099377331001&autoplay=true&share_url=http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words&share_title=Student%20Group:%20Saying%20%27Politically%20Correct,%27%20%27Trash%27%20and%20%27Lame%27%20Is%20Offensive&share=true',
+            'only_matching': True,
+        },
    ]

    def _real_extract(self, url):
@@ -58,3 +64,43 @@ class FoxNewsIE(AMPIE):
            'http://%s/v/feed/video/%s.js?template=fox' % (host, video_id))
        info['id'] = video_id
        return info
+
+
+class FoxNewsInsiderIE(InfoExtractor):
+    _VALID_URL = r'https?://insider\.foxnews\.com/([^/]+/)+(?P<id>[a-z-]+)'
+    IE_NAME = 'foxnews:insider'
+
+    _TEST = {
+        'url': 'http://insider.foxnews.com/2016/08/25/univ-wisconsin-student-group-pushing-silence-certain-words',
+        'md5': 'a10c755e582d28120c62749b4feb4c0c',
+        'info_dict': {
+            'id': '5099377331001',
+            'display_id': 'univ-wisconsin-student-group-pushing-silence-certain-words',
+            'ext': 'mp4',
+            'title': 'Student Group: Saying \'Politically Correct,\' \'Trash\' and \'Lame\' Is Offensive',
+            'description': 'Is campus censorship getting out of control?',
+            'timestamp': 1472168725,
+            'upload_date': '20160825',
+            'thumbnail': 're:^https?://.*\.jpg$',
+        },
+        'add_ie': [FoxNewsIE.ie_key()],
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        embed_url = self._html_search_meta('embedUrl', webpage, 'embed URL')
+
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': FoxNewsIE.ie_key(),
+            'url': embed_url,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+        }
--- a/youtube_dl/extractor/internetvideoarchive.py
+++ b/youtube_dl/extractor/internetvideoarchive.py
@@ -48,13 +48,23 @@ class InternetVideoArchiveIE(InfoExtractor):
             # There are multiple videos in the playlist while only the first one
            # matches the video played in browsers
            video_info = configuration['playlist'][0]
+            title = video_info['title']

            formats = []
            for source in video_info['sources']:
                file_url = source['file']
                if determine_ext(file_url) == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        file_url, video_id, ext='mp4', m3u8_id='hls'))
+                    m3u8_formats = self._extract_m3u8_formats(
+                        file_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
+                    if m3u8_formats:
+                        formats.extend(m3u8_formats)
+                        file_url = m3u8_formats[0]['url']
+                        formats.extend(self._extract_f4m_formats(
+                            file_url.replace('.m3u8', '.f4m'),
+                            video_id, f4m_id='hds', fatal=False))
+                        formats.extend(self._extract_mpd_formats(
+                            file_url.replace('.m3u8', '.mpd'),
+                            video_id, mpd_id='dash', fatal=False))
                else:
                    a_format = {
                        'url': file_url,
@@ -70,7 +80,6 @@ class InternetVideoArchiveIE(InfoExtractor):

            self._sort_formats(formats)

-            title = video_info['title']
            description = video_info.get('description')
            thumbnail = video_info.get('image')
        else:
--- a/youtube_dl/extractor/pornovoisines.py
+++ b/youtube_dl/extractor/pornovoisines.py
@@ -2,7 +2,6 @@
 from __future__ import unicode_literals

 import re
-import random

 from .common import InfoExtractor
 from ..utils import (
@@ -13,61 +12,69 @@ from ..utils import (


 class PornoVoisinesIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?pornovoisines\.com/showvideo/(?P<id>\d+)/(?P<display_id>[^/]+)'
-
-    _VIDEO_URL_TEMPLATE = 'http://stream%d.pornovoisines.com' \
-        '/static/media/video/transcoded/%s-640x360-1000-trscded.mp4'
-
-    _SERVER_NUMBERS = (1, 2)
+    _VALID_URL = r'https?://(?:www\.)?pornovoisines\.com/videos/show/(?P<id>\d+)/(?P<display_id>[^/.]+)'

    _TEST = {
-        'url': 'http://www.pornovoisines.com/showvideo/1285/recherche-appartement/',
-        'md5': '5ac670803bc12e9e7f9f662ce64cf1d1',
+        'url': 'http://www.pornovoisines.com/videos/show/919/recherche-appartement.html',
+        'md5': '6f8aca6a058592ab49fe701c8ba8317b',
        'info_dict': {
-            'id': '1285',
+            'id': '919',
            'display_id': 'recherche-appartement',
            'ext': 'mp4',
            'title': 'Recherche appartement',
-            'description': 'md5:819ea0b785e2a04667a1a01cdc89594e',
+            'description': 'md5:fe10cb92ae2dd3ed94bb4080d11ff493',
            'thumbnail': 're:^https?://.*\.jpg$',
            'upload_date': '20140925',
            'duration': 120,
            'view_count': int,
            'average_rating': float,
-            'categories': ['Débutantes', 'Scénario', 'Sodomie'],
+            'categories': ['Débutante', 'Débutantes', 'Scénario', 'Sodomie'],
            'age_limit': 18,
+            'subtitles': {
+                'fr': [{
+                    'ext': 'vtt',
+                }]
+            },
        }
    }

-    @classmethod
-    def build_video_url(cls, num):
-        return cls._VIDEO_URL_TEMPLATE % (random.choice(cls._SERVER_NUMBERS), num)
-
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
        display_id = mobj.group('display_id')

+        settings_url = self._download_json(
+            'http://www.pornovoisines.com/api/video/%s/getsettingsurl/' % video_id,
+            video_id, note='Getting settings URL')['video_settings_url']
+        settings = self._download_json(settings_url, video_id)['data']
+
+        formats = []
+        for kind, data in settings['variants'].items():
+            if kind == 'HLS':
+                formats.extend(self._extract_m3u8_formats(
+                    data, video_id, ext='mp4', entry_protocol='m3u8_native', m3u8_id='hls'))
+            elif kind == 'MP4':
+                for item in data:
+                    formats.append({
+                        'url': item['url'],
+                        'height': item.get('height'),
+                        'bitrate': item.get('bitrate'),
+                    })
+        self._sort_formats(formats)
+
        webpage = self._download_webpage(url, video_id)

-        video_url = self.build_video_url(video_id)
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)

-        title = self._html_search_regex(
-            r'<h1>(.+?)</h1>', webpage, 'title', flags=re.DOTALL)
-        description = self._html_search_regex(
-            r'<article id="descriptif">(.+?)</article>',
-            webpage, 'description', fatal=False, flags=re.DOTALL)
-
-        thumbnail = self._search_regex(
-            r'<div id="mediaspace%s">\s*<img src="/?([^"]+)"' % video_id,
-            webpage, 'thumbnail', fatal=False)
-        if thumbnail:
-            thumbnail = 'http://www.pornovoisines.com/%s' % thumbnail
+        # The webpage has a bug - there's no space between "thumb" and src=
+        thumbnail = self._html_search_regex(
+            r'<img[^>]+class=([\'"])thumb\1[^>]*src=([\'"])(?P<url>[^"]+)\2',
+            webpage, 'thumbnail', fatal=False, group='url')

        upload_date = unified_strdate(self._search_regex(
-            r'Publié le ([\d-]+)', webpage, 'upload date', fatal=False))
-        duration = int_or_none(self._search_regex(
-            'Durée (\d+)', webpage, 'duration', fatal=False))
+            r'Le\s*<b>([\d/]+)', webpage, 'upload date', fatal=False))
+        duration = settings.get('main', {}).get('duration')
        view_count = int_or_none(self._search_regex(
            r'(\d+) vues', webpage, 'view count', fatal=False))
        average_rating = self._search_regex(
@@ -75,15 +82,19 @@ class PornoVoisinesIE(InfoExtractor):
        if average_rating:
            average_rating = float_or_none(average_rating.replace(',', '.'))

-        categories = self._html_search_meta(
-            'keywords', webpage, 'categories', fatal=False)
+        categories = self._html_search_regex(
+            r'(?s)Catégories\s*:\s*<b>(.+?)</b>', webpage, 'categories', fatal=False)
        if categories:
            categories = [category.strip() for category in categories.split(',')]

+        subtitles = {'fr': [{
+            'url': subtitle,
+        } for subtitle in settings.get('main', {}).get('vtt_tracks', {}).values()]}
+
        return {
            'id': video_id,
            'display_id': display_id,
-            'url': video_url,
+            'formats': formats,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
@@ -93,4 +104,5 @@ class PornoVoisinesIE(InfoExtractor):
            'average_rating': average_rating,
            'categories': categories,
            'age_limit': 18,
+            'subtitles': subtitles,
        }
--- a/youtube_dl/extractor/rottentomatoes.py
+++ b/youtube_dl/extractor/rottentomatoes.py
@@ -1,7 +1,6 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..compat import compat_urlparse
 from .internetvideoarchive import InternetVideoArchiveIE


@@ -11,21 +10,23 @@ class RottenTomatoesIE(InfoExtractor):
    _TEST = {
        'url': 'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
        'info_dict': {
-            'id': '613340',
+            'id': '11028566',
            'ext': 'mp4',
            'title': 'Toy Story 3',
+            'description': 'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
+            'thumbnail': 're:^https?://.*\.jpg$',
        },
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        og_video = self._og_search_video_url(webpage)
-        query = compat_urlparse.urlparse(og_video).query
+        iva_id = self._search_regex(r'publishedid=(\d+)', webpage, 'internet video archive id')

        return {
            '_type': 'url_transparent',
-            'url': InternetVideoArchiveIE._build_xml_url(query),
+            'url': 'http://video.internetvideoarchive.net/player/6/configuration.ashx?domain=www.videodetective.com&customerid=69249&playerid=641&publishedid=' + iva_id,
            'ie_key': InternetVideoArchiveIE.ie_key(),
+            'id': video_id,
            'title': self._og_search_title(webpage),
        }
--- a/youtube_dl/extractor/theplatform.py
+++ b/youtube_dl/extractor/theplatform.py
@@ -96,7 +96,7 @@ class ThePlatformBaseIE(OnceIE):
 class ThePlatformIE(ThePlatformBaseIE, AdobePassIE):
    _VALID_URL = r'''(?x)
        (?:https?://(?:link|player)\.theplatform\.com/[sp]/(?P<provider_id>[^/]+)/
-           (?:(?:(?:[^/]+/)+select/)?(?P<media>media/(?:guid/\d+/)?)|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
+           (?:(?:(?:[^/]+/)+select/)?(?P<media>media/(?:guid/\d+/)?)?|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
         |theplatform:)(?P<id>[^/\?&]+)'''

    _TESTS = [{
@@ -116,6 +116,7 @@ class ThePlatformIE(ThePlatformBaseIE, AdobePassIE):
            # rtmp download
            'skip_download': True,
        },
+        'skip': '404 Not Found',
    }, {
        # from http://www.cnet.com/videos/tesla-model-s-a-second-step-towards-a-cleaner-motoring-future/
        'url': 'http://link.theplatform.com/s/kYEXFC/22d_qsQ6MIRT',
--- a/youtube_dl/extractor/vimple.py
+++ b/youtube_dl/extractor/vimple.py
@@ -28,23 +28,24 @@ class SprutoBaseIE(InfoExtractor):

 class VimpleIE(SprutoBaseIE):
    IE_DESC = 'Vimple - one-click video hosting'
-    _VALID_URL = r'https?://(?:player\.vimple\.ru/iframe|vimple\.ru)/(?P<id>[\da-f-]{32,36})'
-    _TESTS = [
-        {
-            'url': 'http://vimple.ru/c0f6b1687dcd4000a97ebe70068039cf',
-            'md5': '2e750a330ed211d3fd41821c6ad9a279',
-            'info_dict': {
-                'id': 'c0f6b168-7dcd-4000-a97e-be70068039cf',
-                'ext': 'mp4',
-                'title': 'Sunset',
-                'duration': 20,
-                'thumbnail': 're:https?://.*?\.jpg',
-            },
-        }, {
-            'url': 'http://player.vimple.ru/iframe/52e1beec-1314-4a83-aeac-c61562eadbf9',
-            'only_matching': True,
-        }
-    ]
+    _VALID_URL = r'https?://(?:player\.vimple\.(?:ru|co)/iframe|vimple\.(?:ru|co))/(?P<id>[\da-f-]{32,36})'
+    _TESTS = [{
+        'url': 'http://vimple.ru/c0f6b1687dcd4000a97ebe70068039cf',
+        'md5': '2e750a330ed211d3fd41821c6ad9a279',
+        'info_dict': {
+            'id': 'c0f6b168-7dcd-4000-a97e-be70068039cf',
+            'ext': 'mp4',
+            'title': 'Sunset',
+            'duration': 20,
+            'thumbnail': 're:https?://.*?\.jpg',
+        },
+    }, {
+        'url': 'http://player.vimple.ru/iframe/52e1beec-1314-4a83-aeac-c61562eadbf9',
+        'only_matching': True,
+    }, {
+        'url': 'http://vimple.co/04506a053f124483b8fb05ed73899f19',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
--- a/youtube_dl/extractor/youjizz.py
+++ b/youtube_dl/extractor/youjizz.py
@@ -1,21 +1,16 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-)


 class YouJizzIE(InfoExtractor):
    _VALID_URL = r'https?://(?:\w+\.)?youjizz\.com/videos/(?:[^/#?]+)?-(?P<id>[0-9]+)\.html(?:$|[?#])'
    _TESTS = [{
        'url': 'http://www.youjizz.com/videos/zeichentrick-1-2189178.html',
-        'md5': '07e15fa469ba384c7693fd246905547c',
+        'md5': '78fc1901148284c69af12640e01c6310',
        'info_dict': {
            'id': '2189178',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Zeichentrick 1',
            'age_limit': 18,
        }
@@ -27,38 +22,18 @@ class YouJizzIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
+        # YouJizz's HTML5 player has invalid HTML
+        webpage = webpage.replace('"controls', '" controls')
        age_limit = self._rta_search(webpage)
        video_title = self._html_search_regex(
            r'<title>\s*(.*)\s*</title>', webpage, 'title')

-        embed_page_url = self._search_regex(
-            r'(https?://www.youjizz.com/videos/embed/[0-9]+)',
-            webpage, 'embed page')
-        webpage = self._download_webpage(
-            embed_page_url, video_id, note='downloading embed page')
+        info_dict = self._parse_html5_media_entries(url, webpage, video_id)[0]

-        # Get the video URL
-        m_playlist = re.search(r'so.addVariable\("playlist", ?"(?P<playlist>.+?)"\);', webpage)
-        if m_playlist is not None:
-            playlist_url = m_playlist.group('playlist')
-            playlist_page = self._download_webpage(playlist_url, video_id,
-                                                   'Downloading playlist page')
-            m_levels = list(re.finditer(r'<level bitrate="(\d+?)" file="(.*?)"', playlist_page))
-            if len(m_levels) == 0:
-                raise ExtractorError('Unable to extract video url')
-            videos = [(int(m.group(1)), m.group(2)) for m in m_levels]
-            (_, video_url) = sorted(videos)[0]
-            video_url = video_url.replace('%252F', '%2F')
-        else:
-            video_url = self._search_regex(r'so.addVariable\("file",encodeURIComponent\("(?P<source>[^"]+)"\)\);',
-                                           webpage, 'video URL')
-
-        return {
+        info_dict.update({
            'id': video_id,
-            'url': video_url,
            'title': video_title,
-            'ext': 'flv',
-            'format': 'flv',
-            'player_url': embed_page_url,
            'age_limit': age_limit,
-        }
+        })
+
+        return info_dict
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -264,7 +264,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                         )
                     )?                                                       # all until now is optional -> you can pass the naked ID
                     ([0-9A-Za-z_-]{11})                                      # here is it! the YouTube video ID
-                     (?!.*?&list=)                                            # combined list/video URLs are handled by the playlist IE
+                     (?!.*?\blist=)                                            # combined list/video URLs are handled by the playlist IE
                     (?(1).+)?                                                # if we found the ID, everything can follow
                     $"""
    _NEXT_URL_RE = r'[\?&]next_url=([^&]+)'
@@ -1778,11 +1778,14 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
    _VALID_URL = r"""(?x)(?:
                        (?:https?://)?
                        (?:\w+\.)?
-                        youtube\.com/
                        (?:
-                           (?:course|view_play_list|my_playlists|artist|playlist|watch|embed/videoseries)
-                           \? (?:.*?[&;])*? (?:p|a|list)=
-                        |  p/
+                            youtube\.com/
+                            (?:
+                               (?:course|view_play_list|my_playlists|artist|playlist|watch|embed/videoseries)
+                               \? (?:.*?[&;])*? (?:p|a|list)=
+                            |  p/
+                            )|
+                            youtu\.be/[0-9A-Za-z_-]{11}\?.*?\blist=
                        )
                        (
                            (?:PL|LL|EC|UU|FL|RD|UL)?[0-9A-Za-z-_]{10,}
@@ -1887,6 +1890,9 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
            'skip_download': True,
        },
        'add_ie': [YoutubeIE.ie_key()],
+    }, {
+        'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
+        'only_matching': True,
    }]

    def _real_initialize(self):
@@ -2376,7 +2382,7 @@ class YoutubeWatchLaterIE(YoutubePlaylistIE):
    }]

    def _real_extract(self, url):
-        video = self._check_download_just_video(url, 'WL')
+        _, video = self._check_download_just_video(url, 'WL')
        if video:
            return video
        _, playlist = self._extract_playlist('WL')
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -423,7 +423,15 @@ def parseOpts(overrideArguments=None):
    downloader.add_option(
        '--fragment-retries',
        dest='fragment_retries', metavar='RETRIES', default=10,
-        help='Number of retries for a fragment (default is %default), or "infinite" (DASH only)')
+        help='Number of retries for a fragment (default is %default), or "infinite" (DASH and hlsnative only)')
+    downloader.add_option(
+        '--skip-unavailable-fragments',
+        action='store_true', dest='skip_unavailable_fragments', default=True,
+        help='Skip unavailable fragments (DASH and hlsnative only)')
+    general.add_option(
+        '--abort-on-unavailable-fragment',
+        action='store_false', dest='skip_unavailable_fragments',
+        help='Abort downloading when some fragment is not available')
    downloader.add_option(
        '--buffer-size',
        dest='buffersize', metavar='SIZE', default='1024',
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.09.03'
+__version__ = '2016.09.04'
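
The options.py hunk above registers two flags that write to the same `dest`: `--skip-unavailable-fragments` (`store_true`, default `True`) and `--abort-on-unavailable-fragment` (`store_false`). A minimal standalone sketch of that pairing with plain `optparse` follows. The helper names `build_parser` and `skip_fragments` are hypothetical and exist only for illustration, and both flags are registered on the parser itself rather than on the option groups used in the diff.

```python
from optparse import OptionParser


def build_parser():
    # Hypothetical helper: a bare parser holding only the two fragment flags.
    parser = OptionParser()
    parser.add_option(
        '--skip-unavailable-fragments',
        action='store_true', dest='skip_unavailable_fragments', default=True,
        help='Skip unavailable fragments')
    parser.add_option(
        '--abort-on-unavailable-fragment',
        action='store_false', dest='skip_unavailable_fragments',
        help='Abort downloading when some fragment is not available')
    return parser


def skip_fragments(argv):
    # Hypothetical helper: parse argv and return the shared destination value.
    opts, _ = build_parser().parse_args(argv)
    return opts.skip_unavailable_fragments


if __name__ == '__main__':
    print(skip_fragments([]))
    print(skip_fragments(['--abort-on-unavailable-fragment']))
```

Because both flags share one destination, whichever appears last on the command line decides the value, and omitting both keeps the default of skipping.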
Author	SHA1	Message	Date
Sergey M․	d9606d9b6c	release 2016.09.04	2016-09-04 20:51:48 +07:00
Remita Amine	433af6ad30	[theplatform] fix player regex(closes #10546 )	2016-09-04 14:24:41 +01:00
Sergey M․	feaa5ad787	[youtube:playlist] Extend _VALID_URL	2016-09-04 20:12:34 +07:00
Remita Amine	100bd86a68	[rottentomatoes] delegate extraction to InternetVideoArchiveIE	2016-09-04 11:45:29 +01:00
Remita Amine	0def758782	[internetvideoarchive] extract all formats	2016-09-04 11:45:29 +01:00
Yen Chi Hsuan	919cf1a62f	[downloader/dash] Abort if the first segment fails Closes #10497, Closes #10542	2016-09-04 17:32:29 +08:00
Yen Chi Hsuan	b29cd56591	[pornovoisines] Fix extraction (closes #10469 )	2016-09-04 17:01:39 +08:00
Yen Chi Hsuan	622638512b	[rottentomatoes] Fix extraction Closes #10467	2016-09-04 16:25:59 +08:00
Sergey M․	37c7490ac6	[espn] Extend _VALID_URL (Closes #10549 )	2016-09-04 04:59:46 +07:00
Sergey M․	091624f9da	[vimple] Extend _VALID_URL (Closes #10547 )	2016-09-04 03:39:13 +07:00
Sergey M․	7e5dc339de	[youtube:watchlater] Fix extraction (Closes #10544 )	2016-09-04 00:29:01 +07:00
Sergey M․	4a69fa04e0	[downloader/dash] Abort download immediately after giving up on some fragment	2016-09-03 17:51:48 +07:00
Sergey M․	2e99cd30c3	[downloader/dash:hls] Report exact fragment error on retry	2016-09-03 17:51:48 +07:00
Sergey M․	25afc2a783	[downloader/dash:hls] Respect --fragment-retries and --skip-unavailable-fragments (Closes #10165 , closes #10448 )	2016-09-03 17:51:48 +07:00
Sergey M․	9603b66012	Introduce --skip-unavailable-fragments	2016-09-03 17:51:48 +07:00
Yen Chi Hsuan	45aab4d30b	[youjizz] Fix extraction. The site has moved to HTML5 Closes #10437	2016-09-03 18:37:36 +08:00
Yen Chi Hsuan	ed2bfe93aa	[fc2:embed] Add ie_key	2016-09-03 18:22:00 +08:00
Yen Chi Hsuan	cdc783510b	[foxnews:insider] Add new extractor Closes #10445	2016-09-03 18:16:19 +08:00
Yen Chi Hsuan	cf0efe9636	[fc2:embed] New extractor for Flash player URLs Closes #10512	2016-09-03 17:25:03 +08:00
Christian Pointner	dedb177029	Fix parsing of HTML5 media elements. This fixes an error in _parse_html5_media_entries in case an audio or video tag directly uses a src attribute instead of <source> elements in its body.	2016-09-04 16:09:35 +07:00
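
The two youtube.py hunks in feaa5ad787 appear to work as a pair: the video pattern's negative lookahead changes from `&list=` to `\blist=`, so a `youtu.be/<id>?list=...` URL is no longer claimed as a plain video, and the playlist pattern gains a `youtu.be` branch to pick it up. The sketch below illustrates this with heavily simplified stand-in patterns. They are assumptions made for the example, not the full `_VALID_URL` expressions, and `classify` is a hypothetical helper.

```python
import re

VIDEO_ID = r'[0-9A-Za-z_-]{11}'

# Simplified stand-ins for the lookahead before and after the change.
OLD_VIDEO_RE = re.compile(r'https?://youtu\.be/(%s)(?!.*?&list=)' % VIDEO_ID)
NEW_VIDEO_RE = re.compile(r'https?://youtu\.be/(%s)(?!.*?\blist=)' % VIDEO_ID)

# Simplified stand-in for the new youtu.be branch of the playlist pattern.
PLAYLIST_RE = re.compile(
    r'https?://youtu\.be/%s\?.*?\blist='
    r'((?:PL|LL|EC|UU|FL|RD|UL)?[0-9A-Za-z_-]{10,})' % VIDEO_ID)


def classify(url):
    # Hypothetical helper: try the playlist pattern first, then the video one.
    m = PLAYLIST_RE.match(url)
    if m:
        return ('playlist', m.group(1))
    m = NEW_VIDEO_RE.match(url)
    if m:
        return ('video', m.group(1))
    return None


if __name__ == '__main__':
    url = 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21'
    # The old lookahead only rejected "&list=", so "?list=" slipped through.
    print(OLD_VIDEO_RE.match(url) is not None)
    print(NEW_VIDEO_RE.match(url) is not None)
    print(classify(url))
```

The `\b` word boundary matches after either `?` or `&`, which is why the first query parameter is now covered, while a parameter such as `playlist=` would not trigger it.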