release 2016.08.01

[ChangeLog] Add recent changes
[cwtv] Add support for cwtvpr.com (Closes #10196 )
21 changed files with 439 additions and 67 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.07.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.07.24**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.08.01*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.08.01**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.07.24
+[debug] youtube-dl version 2016.08.01
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/ChangeLog
+++ b/ChangeLog
@@ -0,0 +1,271 @@
+version 2016.08.01
+
+Fixed/improved extractors
+- [yandexmusic:track] Adapt to changes in track location JSON (#10193)
+- [bloomberg] Support another form of player (#10187)
+- [limelight] Skip DRM protected videos
+- [safari] Relax regular expressions for URL matching (#10202)
+- [cwtv] Add support for cwtvpr.com (#10196)
+
+version 2016.07.30
+
+Fixed/improved extractors
+- [twitch:clips] Sort formats
+- [tv2] Use m3u8_native
+- [tv2:article] Fix video detection (#10188)
+- rtve (#10076)
+- [dailymotion:playlist] Optimize download archive processing (#10180)
+
+
+version 2016.07.28
+
+Fixed/improved extractors
+- shared (#10170)
+- soundcloud (#10179)
+- twitch (#9767)
+
+
+version 2016.07.26.2
+
+Fixed/improved extractors
+- smotri
+- camdemy
+- mtv
+- comedycentral
+- cmt
+- cbc
+- mgtv
+- orf
+
+
+version 2016.07.24
+
+New extractors
+- arkena (#8682)
+- lcp (#8682)
+
+Fixed/improved extractors
+- facebook (#10151)
+- dailymail
+- telegraaf
+- dcn
+- onet
+- tvp
+
+Miscellaneous
+- Support $Time$ in DASH manifests
+
+
+version 2016.07.22
+
+New extractors
+- odatv (#9285)
+
+Fixed/improved extractors
+- bbc
+- youjizz (#10131)
+- youtube (#10140)
+- pornhub (#10138)
+- eporner (#10139)
+
+
+version 2016.07.17
+
+New extractors
+- nintendo (#9986)
+- streamable (#9122)
+
+Fixed/improved extractors
+- ard (#10095)
+- mtv
+- comedycentral (#10101)
+- viki (#10098)
+- spike (#10106)
+
+Miscellaneous
+- Improved twitter player detection (#10090)
+
+
+version 2016.07.16
+
+New extractors
+- ninenow (#5181)
+
+Fixed/improved extractors
+- rtve (#10076)
+- brightcove
+- 3qsdn
+- syfy (#9087, #3820, #2388)
+- youtube (#10083)
+
+Miscellaneous
+- Fix subtitle embedding for video-only and audio-only files (#10081)
+
+
+version 2016.07.13
+
+New extractors
+- rudo
+
+Fixed/improved extractors
+- biobiochiletv
+- tvplay
+- dbtv
+- brightcove
+- tmz
+- youtube (#10059)
+- shahid (#10062)
+- vk
+- ellentv (#10067)
+
+
+version 2016.07.11
+
+New extractors
+- roosterteeth (#9864)
+
+Fixed/improved extractors
+- miomio (#9605)
+- vuclip
+- youtube
+- vidzi (#10058)
+
+
+version 2016.07.09.2
+
+Fixed/improved extractors
+- vimeo (#1638)
+- facebook (#10048)
+- lynda (#10047)
+- animeondemand
+
+Fixed/improved features
+- Embedding subtitles no longer throws an error with problematic inputs (#9063)
+
+
+version 2016.07.09.1
+
+Fixed/improved extractors
+- youtube
+- ard
+- srmediatek (#9373)
+
+
+version 2016.07.09
+
+New extractors
+- Flipagram (#9898)
+
+Fixed/improved extractors
+- telecinco
+- toutv
+- radiocanada
+- tweakers (#9516)
+- lynda
+- nick (#7542)
+- polskieradio (#10028)
+- le
+- facebook (#9851)
+- mgtv
+- animeondemand (#10031)
+
+Fixed/improved features
+- `--postprocessor-args` and `--downloader-args` now accept non-ASCII inputs
+  on non-Windows systems
+
+
+version 2016.07.07
+
+New extractors
+- kamcord (#10001)
+
+Fixed/improved extractors
+- spiegel (#10018)
+- metacafe (#8539, #3253)
+- onet (#9950)
+- francetv (#9955)
+- brightcove (#9965)
+- daum (#9972)
+
+
+version 2016.07.06
+
+Fixed/improved extractors
+- youtube (#10007, #10009)
+- xuite
+- stitcher
+- spiegel
+- slideshare
+- sandia
+- rtvnh
+- prosiebensat1
+- onionstudios
+
+
+version 2016.07.05
+
+Fixed/improved extractors
+- brightcove
+- yahoo (#9995)
+- pornhub (#9997)
+- iqiyi
+- kaltura (#5557)
+- la7
+- Changed features
+- Rename --cn-verification-proxy to --geo-verification-proxy
+Miscellaneous
+- Add script for displaying downloads statistics
+
+
+version 2016.07.03.1
+
+Fixed/improved extractors
+- theplatform
+- aenetworks
+- nationalgeographic
+- hrti (#9482)
+- facebook (#5701)
+- buzzfeed (#5701)
+- rai (#8617, #9157, #9232, #8552, #8551)
+- nationalgeographic (#9991)
+- iqiyi
+
+
+version 2016.07.03
+
+New extractors
+- hrti (#9482)
+
+Fixed/improved extractors
+- vk (#9981)
+- facebook (#9938)
+- xtube (#9953, #9961)
+
+
+version 2016.07.02
+
+New extractors
+- fusion (#9958)
+
+Fixed/improved extractors
+- twitch (#9975)
+- vine (#9970)
+- periscope (#9967)
+- pornhub (#8696)
+
+
+version 2016.07.01
+
+New extractors
+- 9c9media
+- ctvnews (#2156)
+- ctv (#4077)
+
+Fixed/improved extractors
+- rds
+- meta (#8789)
+- pornhub (#9964)
+- sixplay (#2183)
+
+New features
+- Accept quoted strings across multiple lines (#9940)
--- a/Makefile
+++ b/Makefile
@@ -94,7 +94,7 @@ _EXTRACTOR_FILES != find youtube_dl/extractor -iname '*.py' -and -not -iname 'la
 youtube_dl/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
 	$(PYTHON) devscripts/make_lazy_extractors.py $@

-youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
+youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish ChangeLog
 	@tar -czf youtube-dl.tar.gz --transform "s|^|youtube-dl/|" --owner 0 --group 0 \
 		--exclude '*.DS_Store' \
 		--exclude '*.kate-swp' \
@@ -107,7 +107,7 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
 		--exclude 'docs/_build' \
 		-- \
 		bin devscripts test youtube_dl docs \
-		LICENSE README.md README.txt \
+		ChangeLog LICENSE README.md README.txt \
 		Makefile MANIFEST.in youtube-dl.1 youtube-dl.bash-completion \
 		youtube-dl.zsh youtube-dl.fish setup.py \
 		youtube-dl
--- a/devscripts/release.sh
+++ b/devscripts/release.sh
@@ -71,9 +71,12 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py

+/bin/echo -e "\n### Changing version in ChangeLog..."
+sed -i "s/<unreleased>/$version/" ChangeLog
+
 /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
 make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
-git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py
+git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
 git commit $gpg_sign_commits -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
--- a/devscripts/show-downloads-statistics.py
+++ b/devscripts/show-downloads-statistics.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python
 from __future__ import unicode_literals

+import itertools
 import json
 import os
 import re
@@ -21,21 +22,26 @@ def format_size(bytes):

 total_bytes = 0

-releases = json.loads(compat_urllib_request.urlopen(
-    'https://api.github.com/repos/rg3/youtube-dl/releases').read().decode('utf-8'))
+for page in itertools.count(1):
+    releases = json.loads(compat_urllib_request.urlopen(
+        'https://api.github.com/repos/rg3/youtube-dl/releases?page=%s' % page
+    ).read().decode('utf-8'))

-for release in releases:
-    compat_print(release['name'])
-    for asset in release['assets']:
-        asset_name = asset['name']
-        total_bytes += asset['download_count'] * asset['size']
-        if all(not re.match(p, asset_name) for p in (
-                r'^youtube-dl$',
-                r'^youtube-dl-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
-                r'^youtube-dl\.exe$')):
-            continue
-        compat_print(
-            ' %s size: %s downloads: %d'
-            % (asset_name, format_size(asset['size']), asset['download_count']))
+    if not releases:
+        break
+
+    for release in releases:
+        compat_print(release['name'])
+        for asset in release['assets']:
+            asset_name = asset['name']
+            total_bytes += asset['download_count'] * asset['size']
+            if all(not re.match(p, asset_name) for p in (
+                    r'^youtube-dl$',
+                    r'^youtube-dl-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
+                    r'^youtube-dl\.exe$')):
+                continue
+            compat_print(
+                ' %s size: %s downloads: %d'
+                % (asset_name, format_size(asset['size']), asset['download_count']))

 compat_print('total downloads traffic: %s' % format_size(total_bytes))
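The paging change above keeps requesting numbered pages until the API returns an empty list. A minimal sketch of that loop in isolation, assuming a hypothetical `fetch_page` callable in place of the real GitHub request (`iter_pages`, `fake_fetch`, and `FAKE_PAGES` are illustrative names, not part of the script):

```python
import itertools


def iter_pages(fetch_page):
    """Yield items from successive pages until an empty page is returned."""
    for page in itertools.count(1):
        items = fetch_page(page)
        if not items:
            break
        for item in items:
            yield item


# Hypothetical stand-in for the releases endpoint: two pages of data, then empty.
FAKE_PAGES = {1: ['2016.08.01', '2016.07.30'], 2: ['2016.07.28'], 3: []}


def fake_fetch(page):
    return FAKE_PAGES.get(page, [])


all_releases = list(iter_pages(fake_fetch))
print(all_releases)
```

Stopping on the first empty page avoids having to know the page count in advance, at the cost of one extra request.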
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -142,7 +142,6 @@
 - **CollegeRama**
 - **ComCarCoff**
 - **ComedyCentral**
- - **ComedyCentralShows**: The Daily Show / The Colbert Report
 - **ComedyCentralTV**
 - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
 - **Coub**
@@ -401,7 +400,6 @@
 - **MSN**
 - **MTV**
 - **mtv.de**
- - **mtviggy.com**
 - **mtvservices:embedded**
 - **MuenchenTV**: münchen.tv
 - **MusicPlayOn**
@@ -441,7 +439,6 @@
 - **Newstube**
 - **NextMedia**: 蘋果日報
 - **NextMediaActionNews**: 蘋果日報 - 動新聞
- - **nextmovie.com**
 - **nfb**: National Film Board of Canada
 - **nfl.com**
 - **nhl.com**
@@ -699,6 +696,7 @@
 - **TNAFlix**
 - **TNAFlixNetworkEmbed**
 - **toggle**
+ - **Tosh**: Tosh.0
 - **tou.tv**
 - **Toypics**: Toypics user profile
 - **ToypicsUser**: Toypics user profile
--- a/youtube_dl/extractor/ard.py
+++ b/youtube_dl/extractor/ard.py
@@ -73,6 +73,7 @@ class ARDMediathekIE(InfoExtractor):
            'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
            'duration': 3287,
        },
+        'skip': 'Video is no longer available',
    }]

    def _extract_media_info(self, media_info_url, webpage, video_id):
--- a/youtube_dl/extractor/bloomberg.py
+++ b/youtube_dl/extractor/bloomberg.py
@@ -1,3 +1,4 @@
+# coding: utf-8
 from __future__ import unicode_literals

 import re
@@ -20,6 +21,18 @@ class BloombergIE(InfoExtractor):
        'params': {
            'format': 'best[format_id^=hds]',
        },
+    }, {
+        # video ID in BPlayer(...)
+        'url': 'http://www.bloomberg.com/features/2016-hello-world-new-zealand/',
+        'info_dict': {
+            'id': '938c7e72-3f25-4ddb-8b85-a9be731baa74',
+            'ext': 'flv',
+            'title': 'Meet the Real-Life Tech Wizards of Middle Earth',
+            'description': 'Hello World, Episode 1: New Zealand’s freaky AI babies, robot exoskeletons, and a virtual you.',
+        },
+        'params': {
+            'format': 'best[format_id^=hds]',
+        },
    }, {
        'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
        'only_matching': True,
@@ -33,7 +46,11 @@ class BloombergIE(InfoExtractor):
        webpage = self._download_webpage(url, name)
        video_id = self._search_regex(
            r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
-            webpage, 'id', group='url')
+            webpage, 'id', group='url', default=None)
+        if not video_id:
+            bplayer_data = self._parse_json(self._search_regex(
+                r'BPlayer\(null,\s*({[^;]+})\);', webpage, 'id'), name)
+            video_id = bplayer_data['id']
        title = re.sub(': Video$', '', self._og_search_title(webpage))

        embed_info = self._download_json(
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -1786,7 +1786,7 @@ class InfoExtractor(object):

        any_restricted = False
        for tc in self.get_testcases(include_onlymatching=False):
-            if 'playlist' in tc:
+            if tc.get('playlist', []):
                tc = tc['playlist'][0]
            is_restricted = age_restricted(
                tc.get('info_dict', {}).get('age_limit'), age_limit)
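The `is_suitable` change matters when a test case carries a `playlist` key whose value is empty. A small sketch of the difference, assuming a hypothetical helper `first_testcase` that mirrors only the two lines touched here:

```python
def first_testcase(tc):
    """Return the first playlist entry if there is one, else the test case itself."""
    # tc.get('playlist', []) is falsy for both a missing key and an empty list,
    # so indexing [0] is only attempted when an entry actually exists.
    if tc.get('playlist', []):
        tc = tc['playlist'][0]
    return tc


with_entries = {'playlist': [{'info_dict': {'age_limit': 18}}]}
empty_playlist = {'url': 'http://example.com/', 'playlist': []}

print(first_testcase(with_entries))
print(first_testcase(empty_playlist))
```

With the previous `'playlist' in tc` check, the second call would have raised `IndexError` on `tc['playlist'][0]`.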
--- a/youtube_dl/extractor/cwtv.py
+++ b/youtube_dl/extractor/cwtv.py
@@ -9,7 +9,7 @@ from ..utils import (


 class CWTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?cw(?:tv|seed)\.com/(?:shows/)?(?:[^/]+/){2}\?.*\bplay=(?P<id>[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})'
+    _VALID_URL = r'https?://(?:www\.)?cw(?:tv(?:pr)?|seed)\.com/(?:shows/)?(?:[^/]+/)+[^?]*\?.*\b(?:play|watch)=(?P<id>[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})'
    _TESTS = [{
        'url': 'http://cwtv.com/shows/arrow/legends-of-yesterday/?play=6b15e985-9345-4f60-baf8-56e96be57c63',
        'info_dict': {
@@ -51,6 +51,12 @@ class CWTVIE(InfoExtractor):
    }, {
        'url': 'http://cwtv.com/thecw/chroniclesofcisco/?play=8adebe35-f447-465f-ab52-e863506ff6d6',
        'only_matching': True,
+    }, {
+        'url': 'http://cwtvpr.com/the-cw/video?watch=9eee3f60-ef4e-440b-b3b2-49428ac9c54e',
+        'only_matching': True,
+    }, {
+        'url': 'http://cwtv.com/shows/arrow/legends-of-yesterday/?watch=6b15e985-9345-4f60-baf8-56e96be57c63',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
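As a quick check of the relaxed `_VALID_URL`, the sketch below runs the old and new patterns (copied from this diff) against the test URLs added above. The variable names are illustrative only:

```python
import re

UUID = r'[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12}'

# Patterns as they appear on the removed and added lines of this diff.
OLD_VALID_URL = r'https?://(?:www\.)?cw(?:tv|seed)\.com/(?:shows/)?(?:[^/]+/){2}\?.*\bplay=(?P<id>%s)' % UUID
NEW_VALID_URL = r'https?://(?:www\.)?cw(?:tv(?:pr)?|seed)\.com/(?:shows/)?(?:[^/]+/)+[^?]*\?.*\b(?:play|watch)=(?P<id>%s)' % UUID

urls = [
    'http://cwtv.com/thecw/chroniclesofcisco/?play=8adebe35-f447-465f-ab52-e863506ff6d6',
    'http://cwtvpr.com/the-cw/video?watch=9eee3f60-ef4e-440b-b3b2-49428ac9c54e',
    'http://cwtv.com/shows/arrow/legends-of-yesterday/?watch=6b15e985-9345-4f60-baf8-56e96be57c63',
]

old_matches = [bool(re.match(OLD_VALID_URL, u)) for u in urls]
new_ids = [re.match(NEW_VALID_URL, u).group('id') for u in urls]

print(old_matches)
print(new_ids)
```

Only the first URL matches the old pattern; the `cwtvpr.com` host and the `watch=` parameter are what the new pattern adds.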
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -331,7 +331,9 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):

            for video_id in re.findall(r'data-xid="(.+?)"', webpage):
                if video_id not in video_ids:
-                    yield self.url_result('http://www.dailymotion.com/video/%s' % video_id, 'Dailymotion')
+                    yield self.url_result(
+                        'http://www.dailymotion.com/video/%s' % video_id,
+                        DailymotionIE.ie_key(), video_id)
                    video_ids.add(video_id)

            if re.search(self._MORE_PAGES_INDICATOR, webpage) is None:
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -71,6 +71,7 @@ from .vessel import VesselIE
 from .kaltura import KalturaIE
 from .eagleplatform import EaglePlatformIE
 from .facebook import FacebookIE
+from .soundcloud import SoundcloudIE


 class GenericIE(InfoExtractor):
@@ -784,6 +785,15 @@ class GenericIE(InfoExtractor):
                'upload_date': '20141029',
            }
        },
+        # Soundcloud multiple embeds
+        {
+            'url': 'http://www.guitarplayer.com/lessons/1014/legato-workout-one-hour-to-more-fluid-performance---tab/52809',
+            'info_dict': {
+                'id': '52809',
+                'title': 'Guitar Essentials: Legato Workout—One-Hour to Fluid Performance  | TAB + AUDIO',
+            },
+            'playlist_mincount': 7,
+        },
        # Livestream embed
        {
            'url': 'http://www.esa.int/Our_Activities/Space_Science/Rosetta/Philae_comet_touch-down_webcast',
@@ -1999,12 +2009,9 @@ class GenericIE(InfoExtractor):
            return self.url_result(myvi_url)

        # Look for embedded soundcloud player
-        mobj = re.search(
-            r'<iframe\s+(?:[a-zA-Z0-9_-]+="[^"]+"\s+)*src="(?P<url>https?://(?:w\.)?soundcloud\.com/player[^"]+)"',
-            webpage)
-        if mobj is not None:
-            url = unescapeHTML(mobj.group('url'))
-            return self.url_result(url)
+        soundcloud_urls = SoundcloudIE._extract_urls(webpage)
+        if soundcloud_urls:
+            return _playlist_from_matches(soundcloud_urls, getter=unescapeHTML, ie=SoundcloudIE.ie_key())

        # Look for embedded mtvservices player
        mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
--- a/youtube_dl/extractor/limelight.py
+++ b/youtube_dl/extractor/limelight.py
@@ -37,11 +37,12 @@ class LimelightBaseIE(InfoExtractor):

        for stream in streams:
            stream_url = stream.get('url')
-            if not stream_url:
+            if not stream_url or stream.get('drmProtected'):
                continue
-            if '.f4m' in stream_url:
+            ext = determine_ext(stream_url)
+            if ext == 'f4m':
                formats.extend(self._extract_f4m_formats(
-                    stream_url, video_id, fatal=False))
+                    stream_url, video_id, f4m_id='hds', fatal=False))
            else:
                fmt = {
                    'url': stream_url,
@@ -50,7 +51,7 @@ class LimelightBaseIE(InfoExtractor):
                    'fps': float_or_none(stream.get('videoFrameRate')),
                    'width': int_or_none(stream.get('videoWidthInPixels')),
                    'height': int_or_none(stream.get('videoHeightInPixels')),
-                    'ext': determine_ext(stream_url)
+                    'ext': ext,
                }
                rtmp = re.search(r'^(?P<url>rtmpe?://[^/]+/(?P<app>.+))/(?P<playpath>mp4:.+)$', stream_url)
                if rtmp:
@@ -68,18 +69,23 @@ class LimelightBaseIE(InfoExtractor):

        for mobile_url in mobile_urls:
            media_url = mobile_url.get('mobileUrl')
-            if not media_url:
-                continue
            format_id = mobile_url.get('targetMediaPlatform')
-            if determine_ext(media_url) == 'm3u8':
+            if not media_url or format_id == 'Widevine':
+                continue
+            ext = determine_ext(media_url)
+            if ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
                    media_url, video_id, 'mp4', 'm3u8_native',
                    m3u8_id=format_id, fatal=False))
+            elif ext == 'f4m':
+                formats.extend(self._extract_f4m_formats(
+                    media_url, video_id, f4m_id=format_id, fatal=False))
            else:
                formats.append({
                    'url': media_url,
                    'format_id': format_id,
                    'preference': -1,
+                    'ext': ext,
                })

        self._sort_formats(formats)
@@ -145,7 +151,7 @@ class LimelightMediaIE(LimelightBaseIE):
        'url': 'http://link.videoplatform.limelight.com/media/?mediaId=3ffd040b522b4485b6d84effc750cd86',
        'info_dict': {
            'id': '3ffd040b522b4485b6d84effc750cd86',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'HaP and the HB Prince Trailer',
            'description': 'md5:8005b944181778e313d95c1237ddb640',
            'thumbnail': 're:^https?://.*\.jpeg$',
@@ -154,7 +160,7 @@ class LimelightMediaIE(LimelightBaseIE):
            'upload_date': '20090604',
        },
        'params': {
-            # rtmp download
+            # m3u8 download
            'skip_download': True,
        },
    }, {
@@ -164,7 +170,6 @@ class LimelightMediaIE(LimelightBaseIE):
            'id': 'a3e00274d4564ec4a9b29b9466432335',
            'ext': 'flv',
            'title': '3Play Media Overview Video',
-            'description': '',
            'thumbnail': 're:^https?://.*\.jpeg$',
            'duration': 78.101,
            'timestamp': 1338929955,
--- a/youtube_dl/extractor/rtve.py
+++ b/youtube_dl/extractor/rtve.py
@@ -113,6 +113,8 @@ class RTVEALaCartaIE(InfoExtractor):
        png = self._download_webpage(png_request, video_id, 'Downloading url information')
        video_url = _decrypt_url(png)
        if not video_url.endswith('.f4m'):
+            if '?' not in video_url:
+                video_url = video_url.replace('resources/', 'auth/resources/')
            video_url = video_url.replace('.net.rtve', '.multimedia.cdn.rtve')

        subtitles = None
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@@ -75,7 +75,7 @@ class SafariBaseIE(InfoExtractor):
 class SafariIE(SafariBaseIE):
    IE_NAME = 'safari'
    IE_DESC = 'safaribooksonline.com online video'
-    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/library/view/[^/]+/(?P<course_id>[^/]+)/(?P<part>part\d+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/library/view/[^/]+/(?P<course_id>[^/]+)/(?P<part>[^/?#&]+)\.html'

    _TESTS = [{
        'url': 'https://www.safaribooksonline.com/library/view/hadoop-fundamentals-livelessons/9780133392838/part00.html',
@@ -92,6 +92,9 @@ class SafariIE(SafariBaseIE):
        # non-digits in course id
        'url': 'https://www.safaribooksonline.com/library/view/create-a-nodejs/100000006A0210/part00.html',
        'only_matching': True,
+    }, {
+        'url': 'https://www.safaribooksonline.com/library/view/learning-path-red/9780134664057/RHCE_Introduction.html',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -132,12 +135,15 @@ class SafariIE(SafariBaseIE):

 class SafariApiIE(SafariBaseIE):
    IE_NAME = 'safari:api'
-    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/api/v1/book/(?P<course_id>[^/]+)/chapter(?:-content)?/(?P<part>part\d+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/api/v1/book/(?P<course_id>[^/]+)/chapter(?:-content)?/(?P<part>[^/?#&]+)\.html'

-    _TEST = {
+    _TESTS = [{
        'url': 'https://www.safaribooksonline.com/api/v1/book/9780133392838/chapter/part00.html',
        'only_matching': True,
-    }
+    }, {
+        'url': 'https://www.safaribooksonline.com/api/v1/book/9780134664057/chapter/RHCE_Introduction.html',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
--- a/youtube_dl/extractor/shared.py
+++ b/youtube_dl/extractor/shared.py
@@ -6,7 +6,6 @@ from .common import InfoExtractor
 from ..utils import (
    ExtractorError,
    int_or_none,
-    sanitized_Request,
    urlencode_postdata,
 )

@@ -37,28 +36,33 @@ class SharedIE(InfoExtractor):

    def _real_extract(self, url):
        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+
+        webpage, urlh = self._download_webpage_handle(url, video_id)

        if '>File does not exist<' in webpage:
            raise ExtractorError(
                'Video %s does not exist' % video_id, expected=True)

        download_form = self._hidden_inputs(webpage)
-        request = sanitized_Request(
-            url, urlencode_postdata(download_form))
-        request.add_header('Content-Type', 'application/x-www-form-urlencoded')

        video_page = self._download_webpage(
-            request, video_id, 'Downloading video page')
+            urlh.geturl(), video_id, 'Downloading video page',
+            data=urlencode_postdata(download_form),
+            headers={
+                'Content-Type': 'application/x-www-form-urlencoded',
+                'Referer': urlh.geturl(),
+            })

        video_url = self._html_search_regex(
-            r'data-url="([^"]+)"', video_page, 'video URL')
+            r'data-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
+            video_page, 'video URL', group='url')
        title = base64.b64decode(self._html_search_meta(
            'full:title', webpage, 'title').encode('utf-8')).decode('utf-8')
        filesize = int_or_none(self._html_search_meta(
            'full:size', webpage, 'file size', fatal=False))
        thumbnail = self._html_search_regex(
-            r'data-poster="([^"]+)"', video_page, 'thumbnail', default=None)
+            r'data-poster=(["\'])(?P<url>(?:(?!\1).)+)\1',
+            video_page, 'thumbnail', default=None, group='url')

        return {
            'id': video_id,
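The new `data-url` and `data-poster` patterns accept either quote style by capturing the opening quote and requiring the same character to close the value. A standalone sketch with made-up HTML snippets (the sample markup and URLs are assumptions for illustration):

```python
import re

# Same pattern as on the added line: group 1 is the opening quote, and
# (?!\1). consumes any character that is not that same quote.
DATA_URL_RE = r'data-url=(["\'])(?P<url>(?:(?!\1).)+)\1'

double_quoted = '<div data-url="http://example.com/v.mp4"></div>'
single_quoted = "<div data-url='http://example.com/v.mp4?t=\"x\"'></div>"

url_a = re.search(DATA_URL_RE, double_quoted).group('url')
url_b = re.search(DATA_URL_RE, single_quoted).group('url')

print(url_a)
print(url_b)
```

Because only the opening quote character terminates the value, the other quote character may appear inside it.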
--- a/youtube_dl/extractor/soundcloud.py
+++ b/youtube_dl/extractor/soundcloud.py
@@ -119,6 +119,12 @@ class SoundcloudIE(InfoExtractor):
    _CLIENT_ID = '02gUJC0hH2ct1EGOcYXQIzRFU91c72Ea'
    _IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'

+    @staticmethod
+    def _extract_urls(webpage):
+        return [m.group('url') for m in re.finditer(
+            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1',
+            webpage)]
+
    def report_resolve(self, video_id):
        """Report information extraction."""
        self.to_screen('%s: Resolving id' % video_id)
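`_extract_urls` returns every embedded player URL rather than only the first match, which is what lets the generic extractor build a playlist. A sketch of the same pattern outside the extractor class, run on invented markup (the HTML and the function name `extract_urls` are assumptions):

```python
import re

# Pattern copied from the added _extract_urls method.
SOUNDCLOUD_EMBED_RE = r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1'


def extract_urls(webpage):
    """Return the src of every SoundCloud player iframe found in the page."""
    return [m.group('url') for m in re.finditer(SOUNDCLOUD_EMBED_RE, webpage)]


html = '''
<iframe width="100%" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/1"></iframe>
<iframe src='https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/2'></iframe>
<iframe src="https://example.com/other"></iframe>
'''

found = extract_urls(html)
print(found)
```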
--- a/youtube_dl/extractor/tv2.py
+++ b/youtube_dl/extractor/tv2.py
@@ -8,6 +8,7 @@ from ..utils import (
    determine_ext,
    int_or_none,
    float_or_none,
+    js_to_json,
    parse_iso8601,
    remove_end,
 )
@@ -54,10 +55,11 @@ class TV2IE(InfoExtractor):
                ext = determine_ext(video_url)
                if ext == 'f4m':
                    formats.extend(self._extract_f4m_formats(
-                        video_url, video_id, f4m_id=format_id))
+                        video_url, video_id, f4m_id=format_id, fatal=False))
                elif ext == 'm3u8':
                    formats.extend(self._extract_m3u8_formats(
-                        video_url, video_id, 'mp4', m3u8_id=format_id))
+                        video_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                        m3u8_id=format_id, fatal=False))
                elif ext == 'ism' or video_url.endswith('.ism/Manifest'):
                    pass
                else:
@@ -105,7 +107,7 @@ class TV2ArticleIE(InfoExtractor):
        'url': 'http://www.tv2.no/2015/05/16/nyheter/alesund/krim/pingvin/6930542',
        'info_dict': {
            'id': '6930542',
-            'title': 'Russen hetses etter pingvintyveri – innrømmer å ha åpnet luken på buret',
+            'title': 'Russen hetses etter pingvintyveri - innrømmer å ha åpnet luken på buret',
            'description': 'md5:339573779d3eea3542ffe12006190954',
        },
        'playlist_count': 2,
@@ -119,9 +121,23 @@ class TV2ArticleIE(InfoExtractor):

        webpage = self._download_webpage(url, playlist_id)

+        # Old embed pattern (looks unused nowadays)
+        assets = re.findall(r'data-assetid=["\'](\d+)', webpage)
+
+        if not assets:
+            # New embed pattern
+            for v in re.findall(r'TV2ContentboxVideo\(({.+?})\)', webpage):
+                video = self._parse_json(
+                    v, playlist_id, transform_source=js_to_json, fatal=False)
+                if not video:
+                    continue
+                asset = video.get('assetId')
+                if asset:
+                    assets.append(asset)
+
        entries = [
-            self.url_result('http://www.tv2.no/v/%s' % video_id, 'TV2')
-            for video_id in re.findall(r'data-assetid="(\d+)"', webpage)]
+            self.url_result('http://www.tv2.no/v/%s' % asset_id, 'TV2')
+            for asset_id in assets]

        title = remove_end(self._og_search_title(webpage), ' - TV2.no')
        description = remove_end(self._og_search_description(webpage), ' - TV2.no')
--- a/youtube_dl/extractor/twitch.py
+++ b/youtube_dl/extractor/twitch.py
@@ -461,7 +461,7 @@ class TwitchClipsIE(InfoExtractor):
    IE_NAME = 'twitch:clips'
    _VALID_URL = r'https?://clips\.twitch\.tv/(?:[^/]+/)*(?P<id>[^/?#&]+)'

-    _TEST = {
+    _TESTS = [{
        'url': 'https://clips.twitch.tv/ea/AggressiveCobraPoooound',
        'md5': '761769e1eafce0ffebfb4089cb3847cd',
        'info_dict': {
@@ -473,7 +473,11 @@ class TwitchClipsIE(InfoExtractor):
            'uploader': 'stereotype_',
            'uploader_id': 'stereotype_',
        },
-    }
+    }, {
+        # multiple formats
+        'url': 'https://clips.twitch.tv/rflegendary/UninterestedBeeDAESuppy',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
@@ -485,15 +489,27 @@ class TwitchClipsIE(InfoExtractor):
                r'(?s)clipInfo\s*=\s*({.+?});', webpage, 'clip info'),
            video_id, transform_source=js_to_json)

-        video_url = clip['clip_video_url']
-        title = clip['channel_title']
+        title = clip.get('channel_title') or self._og_search_title(webpage)
+
+        formats = [{
+            'url': option['source'],
+            'format_id': option.get('quality'),
+            'height': int_or_none(option.get('quality')),
+        } for option in clip.get('quality_options', []) if option.get('source')]
+
+        if not formats:
+            formats = [{
+                'url': clip['clip_video_url'],
+            }]
+
+        self._sort_formats(formats)

        return {
            'id': video_id,
-            'url': video_url,
            'title': title,
            'thumbnail': self._og_search_thumbnail(webpage),
            'creator': clip.get('broadcaster_display_name') or clip.get('broadcaster_login'),
            'uploader': clip.get('curator_login'),
            'uploader_id': clip.get('curator_display_name'),
+            'formats': formats,
        }
--- a/youtube_dl/extractor/yandexmusic.py
+++ b/youtube_dl/extractor/yandexmusic.py
@@ -75,6 +75,12 @@ class YandexMusicTrackIE(YandexMusicBaseIE):
            % storage_dir,
            track_id, 'Downloading track location JSON')

+        # Each string is now wrapped in a list; this is probably only temporary, so
+        # both scenarios are supported (see https://github.com/rg3/youtube-dl/issues/10193)
+        for k, v in data.items():
+            if v and isinstance(v, list):
+                data[k] = v[0]
+
        key = hashlib.md5(('XGRlBW9FXlekgbPrRHuSiA' + data['path'][1:] + data['s']).encode('utf-8')).hexdigest()
        storage = storage_dir.split('.')
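The unwrapping step above tolerates both response shapes: plain strings and strings wrapped in single-item lists. A minimal sketch on invented data, assuming a hypothetical helper `unwrap_single_item_lists` that builds a new dict instead of mutating in place:

```python
def unwrap_single_item_lists(data):
    """Replace each non-empty list value with its first element; keep other values."""
    return dict(
        (k, v[0] if v and isinstance(v, list) else v)
        for k, v in data.items())


# Invented sample values; the real response fields are not reproduced here.
wrapped = {'path': ['/some/path'], 's': ['abc'], 'ts': '123', 'empty': []}
plain = {'path': '/some/path', 's': 'abc'}

print(unwrap_single_item_lists(wrapped))
print(unwrap_single_item_lists(plain))
```

Empty lists are left untouched because the `v and ...` check skips them, matching the condition in the diff.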

--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.07.24'
+__version__ = '2016.08.01'
Author	SHA1	Message	Date
Sergey M․	45408eb075	release 2016.08.01	2016-08-01 22:59:23 +07:00
Sergey M․	eafc66855d	[ChangeLog] Add recent changes	2016-08-01 22:56:01 +07:00
Sergey M․	e03d3e6453	[cwtv] Add support for cwtvpr.com (Closes #10196 )	2016-08-01 22:51:01 +07:00
Remita Amine	a70e45f80a	[limelight] keep videos marked as previewStream `e382b953f0 (commitcomment-18472915)`	2016-08-01 16:25:41 +01:00
Sergey M․	697655a7c0	[safari] Relax url regexes (Closes #10202 )	2016-08-01 21:48:48 +07:00
Remita Amine	e382b953f0	[limelight] skip preview and drm protected videos	2016-08-01 00:33:30 +01:00
Yen Chi Hsuan	116e7e0d04	[bloomberg] Support BPlayer() players (closes #10187 )	2016-07-31 14:47:19 +08:00
Sergey M․	cf03e34ad3	[yandexmusic:track] Fix extraction (Closes #10193 )	2016-07-31 07:56:18 +07:00
Sergey M․	2903137292	release 2016.07.30	2016-07-30 14:45:07 +07:00
Sergey M․	9361f2169c	[ChangeLog] Make extractor improvements' descriptions more concrete	2016-07-30 14:43:28 +07:00
Yen Chi Hsuan	35aa6c538f	Add ChangeLog	2016-07-30 12:33:09 +08:00
Sergey M․	fa9f1d16b8	[dailymotion:playlist] Carry long line	2016-07-29 22:47:34 +07:00
Dave	485fedf6fd	[dailymotion:playlist] Optimize download archive processing	2016-07-29 22:45:41 +07:00
Jaime Marquínez Ferrándiz	da0baba5c8	[rtve] Fix extraction for some videos For example http://www.rtve.es/alacarta/videos/documentos-tv/documentos-tv-descredito/3574098/.	2016-07-29 17:20:27 +02:00
Jaime Marquínez Ferrándiz	bb9f3bfedf	Revert "[rtve] Fix extraction (#10076 )" This reverts commit `c39b2ed990`. Apparently outside of Spain using 'auth/resources' is required (#10097).	2016-07-29 17:14:04 +02:00
Sergey M․	dbc0b39b91	[tv2] Improve extraction	2016-07-29 22:01:34 +07:00
Sergey M․	481c5c5137	[tv2:article] Fix extraction (Closes #10188 )	2016-07-29 21:43:17 +07:00
Sergey M․	0cacae2807	[twitch:clips] Sort formats	2016-07-29 09:01:53 +07:00
Sergey M․	d9d56deadf	release 2016.07.28	2016-07-28 02:42:57 +07:00
Sergey M․	74ba450a81	[twitch:clips] Fix extraction (Closes #9767 )	2016-07-28 22:30:09 +07:00
Sergey M․	db19df6ca0	[extractor/generic] Add test for #10179	2016-07-28 22:20:08 +07:00
Sergey M․	fbdf8d15d1	[soundcloud] Add _extract_urls (#10179 )	2016-07-28 22:16:05 +07:00
Sergey M․	94aae01548	[extractor/generic] Extract all soundcloud embeds (Closes #10179 )	2016-07-28 22:15:15 +07:00
Sergey M․	39eef54cf0	[ard:mediathek] Skip unavailable test	2016-07-28 21:38:23 +07:00
Sergey M․	05c8268c81	[shared] Modernize and make more robust	2016-07-27 23:39:02 +07:00
Sergey M․	289a16b4f3	[shared] Respect redirect URL (Closes #10170 )	2016-07-27 23:28:01 +07:00
Sergey M․	7935926baa	[devscripts/show-downloads-statistics] Add support for paging	2016-07-27 00:14:40 +07:00
Sergey M․	dcbb07c35a	release 2016.07.26.2	2016-07-26 23:56:53 +07:00
Sergey M․	40090e8d51	[extractor/common] Improve is_suitable In order to fix breakage introduced by `a3aa814b77`	2016-07-26 23:54:06 +07:00