Getting 'alt' from HTML code

0

I have a string containing the following:

run:
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Miley Cyrus - Wrecking Ball" title="Miley Cyrus - Wrecking Ball" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/blasterjaxx/22557/fifteen/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/peeeya/09-09-2013--fifteen_s.png" alt="BlasterJaxx - Fifteen" title="BlasterJaxx - Fifteen" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/alex-gaudino-feat-crystal-waters/7866/destination-calabria/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2010/_releases/naxwell/18-10-2010--wa_s.png" alt="Alex Gaudino feat. Crystal Waters - Destination Calabria" title="Alex Gaudino feat. Crystal Waters - Destination Calabria" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Don Rimini - Let Me Back Up" title="Don Rimini - Let Me Back Up" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Sebjak &amp; Mike Hawkins - Let Go" title="Sebjak &amp; Mike Hawkins - Let Go" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/deichkind/15426/leider-geil/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2012/_releases/manoua/06-09-2012--leider-geil_s.png" alt="Deichkind - Leider Geil" title="Deichkind - Leider Geil" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Justice - D.A.N.C.E." title="Justice - D.A.N.C.E." />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/fatboy-slim-and-riva-starr/21683/eat-sleep-rave-repeat/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/lames/24-07-2013--fatboy-slim-and-riva-starr-eat-sleep-rave-repeat_s.png" alt="Fatboy Slim and Riva Starr - Eat, Sleep, Rave, Repeat" title="Fatboy Slim and Riva Starr - Eat, Sleep, Rave, Repeat" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Chardy &amp; Kronic - Kavorka" title="Chardy &amp; Kronic - Kavorka" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/kernkraft-400/5442/zombie-nation/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2009/_releases/fraggy/13-12-2009--kernkraft-400-zombie-nation_s.png" alt="Kernkraft 400 - Zombie Nation" title="Kernkraft 400 - Zombie Nation" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Red Hot Chilli Peppers - Californication" title="Red Hot Chilli Peppers - Californication" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="John Dahlback feat. Urban Cone &amp; Lucas Nord - We Were Gods" title="John Dahlback feat. Urban Cone &amp; Lucas Nord - We Were Gods" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/robin-s/4681/show-me-love/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_w0nd3r/2009/september/04-09-2009--robins_show_me_love_klein.jpg" alt="Robin S. - Show Me Love" title="Robin S. - Show Me Love" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/dada-life/17346/feed-the-dada/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2012/_releases/vesper/12-09-2012--dada-life-feed-the-dada_s.png" alt="Dada Life - Feed The Dada" title="Dada Life - Feed The Dada" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Hard Rock Sofa &amp; Swanky Tunes - Stop In My Mind" title="Hard Rock Sofa &amp; Swanky Tunes - Stop In My Mind" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/youngblood-hawke/19360/we-come-running/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/manoua/16-02-2013--we-come-running_s.png" alt="Youngblood Hawke - We Come Running" title="Youngblood Hawke - We Come Running" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Avicii - Dear Boy" title="Avicii - Dear Boy" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/an21-max-vangeli-vs-tiesto-feat-lover-lover/16146/people-of-the-night/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2012/_releases/manoua/18-11-2012--people-of-the-night_s.png" alt="AN21 &amp; Max Vangeli vs. Ti&euml;sto feat. Lover Lover - People Of The Night" title="AN21 &amp; Max Vangeli vs. Ti&euml;sto feat. Lover Lover - People Of The Night" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Empire Of The Sun - DNA" title="Empire Of The Sun - DNA" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/sick-individuals-axwell-feat-taylr-renee/22648/i-am/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/vesper/16-09-2013--sick-individuals-axwell-taylr-renee-i-am_s.png" alt="Sick Individuals &amp; Axwell feat. Taylr Renee - I Am" title="Sick Individuals &amp; Axwell feat. Taylr Renee - I Am" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Krewella - Live For The Night" title="Krewella - Live For The Night" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Arston - Zodiac" title="Arston - Zodiac" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/hard-rock-sofa/19764/rasputin/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/vesper/12-03-2013--hard-rock-sofa-rasputin_s.png" alt="Hard Rock Sofa - Rasputin" title="Hard Rock Sofa - Rasputin" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Deorro - Crank It Up" title="Deorro - Crank It Up" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/will-sparks/20264/the-viking/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/moony/20-04-2013--will-sparks-the-viking_s.png" alt="Will Sparks - The Viking" title="Will Sparks - The Viking" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/coldplay/13510/paradise/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2011/_releases/beatstop/15-11-2011--coldplay-paradise_s.png" alt="Coldplay  - Paradise" title="Coldplay  - Paradise" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/afrojack-d-wayne-bobby-burns/12386/no-beef-bridge/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2011/_releases/tonninski/17-08-2011--afroo_s.png" alt="Afrojack - No Beef" title="Afrojack - No Beef" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/usher/15677/scream/"><img width="87" height="87" border="0" src="http://m.image.weareone.fm/news/_newsgrafiken/2012/_releases/manoua/05-09-2012--scream_s.png" alt="Usher - Scream" title="Usher - Scream" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Showtek &amp; Sonny Wilson feat. We Are Loud - Booyah" title="Showtek &amp; Sonny Wilson feat. We Are Loud - Booyah" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Steve Aoki &amp; Chris Lake &amp; Tujamo - Boneless" title="Steve Aoki &amp; Chris Lake &amp; Tujamo - Boneless" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Dimitri Vegas, Like Mike, Moguai - Mammoth" title="Dimitri Vegas, Like Mike, Moguai - Mammoth" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Lefty, Reecey Boi - Tic Tac Toe" title="Lefty, Reecey Boi - Tic Tac Toe" />
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="Deorro - Lose It" title="Deorro - Lose It" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/fedde-le-grand-sultan-ned-shepard/21292/no-good/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/vesper/25-06-2013--fedde-le-grand-sultan-ned-shephard-no-good_s.png" alt="Fedde Le Grand &amp; Sultan &amp; Ned Shepard - No Good" title="Fedde Le Grand &amp; Sultan &amp; Ned Shepard - No Good" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/blasterjaxx/22557/fifteen/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/peeeya/09-09-2013--fifteen_s.png" alt="BlasterJaxx - Fifteen" title="BlasterJaxx - Fifteen" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/fun-k-house/21216/freed-from-desire/"><img width="87" height="87" border="0" src="http://m.image.weareone.fm/news/_newsgrafiken/2013/_releases/yjs/19-06-2013--fun-k-house-freed-from-desire_s.png" alt="Fun[k]House - Freed From Desire" title="Fun[k]House - Freed From Desire" /></a>
</div>
<div class="rc_release_list_item_picture">
 <img width="87" height="87" border="0" src="http://b.image.web.tb-group.fm/www/icon/kein_release_vorhanden.png" alt="The Aston Shuffle - Can&acute;t Stop Now" title="The Aston Shuffle - Can&acute;t Stop Now" />
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/nicky-romero-feat-krewella/21560/legacy-save-my-life/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/yjs/17-07-2013--nicky-romero-feat-krewella-legacy-save-my-life_s.png" alt="Nicky Romero feat. Krewella - Legacy (Save my life)" title="Nicky Romero feat. Krewella - Legacy (Save my life)" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/hardwell-showtek/16817/how-we-do/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2012/_releases/manoua/04-09-2012--how-we-do_s.png" alt="Hardwell &amp; Showtek - How We Do" title="Hardwell &amp; Showtek - How We Do" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/tjr/21656/what-s-up-suckaz/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/moony/23-07-2013--tjr-what-s-up-suckaz_s.png" alt="TJR - What Up Suckaz" title="TJR - What Up Suckaz" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/major-lazer-feat-busy-signal-the-flexican-fs-green/20958/watch-out-for-this-bumaye/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/manoua/01-06-2013--watch-out-for-this_s.png" alt="Major Lazer feat. Busy Signal, The Flexican &amp; FS Green - Watch Out For This (Bumaye)" title="Major Lazer feat. Busy Signal, The Flexican &amp; FS Green - Watch Out For This (Bumaye)" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/w-w/20839/thunder/"><img width="87" height="87" border="0" src="http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/manoua/20-05-2013--thunder_s.png" alt="W&amp;W - Thunder" title="W&amp;W - Thunder" /></a>
</div>
<div class="rc_release_list_item_picture">
 <a href="/release/martin-garrix/21302/animals/"><img width="87" height="87" border="0" src="http://o.image.weareone.fm/news/_newsgrafiken/2013/_releases/manoua/08-08-2013--animals_s.png" alt="Martin Garrix - Animals" title="Martin Garrix - Animals" /></a>
</div>
BUILD SUCCESSFUL (total time: 3 seconds)

I want to cut away the unneeded part and display only the song names, which are in the alt="..." attribute. This is what I have so far with JSoup:

Document doc = Jsoup.connect("http://www.housetime.fm/tracklist/").get();
Elements links = doc.getElementsByClass("rc_release_list_item_picture");
System.out.println(links);

Does anyone have an idea how to do this?

Tags:
split
parsing

3 answers

3
Best answer

This code does the trick:

public static void main(String[] args) throws IOException {
    Document doc = Jsoup.connect("http://www.housetime.fm/tracklist/").get();
    // Get all the divs that wrap a release picture
    Elements links = doc.getElementsByClass("rc_release_list_item_picture");
    // From those divs, select every image that has an alt attribute
    Elements imgs = links.select("img[alt]");
    for (Element element : imgs) {
        // attr("alt") returns the value of the alt attribute
        System.out.println(element.attr("alt"));
    }
}
  • 0
    Thanks a lot!
0

I haven't used JSoup, but here is what I put together in the last few minutes with a regex:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main
{
    private static String testStrings = "<div class=\"rc_release_list_item_picture\">\n" +
        " <a href=\"/release/fatboy-slim-and-riva-starr/21683/eat-sleep-rave-repeat/\"><img width=\"87\" height=\"87\" border=\"0\" src=\"http://n.image.weareone.fm/news/_newsgrafiken/2013/_releases/lames/24-07-2013--fatboy-slim-and-riva-starr-eat-sleep-rave-repeat_s.png\" alt=\"Fatboy Slim and Riva Starr - Eat, Sleep, Rave, Repeat\" title=\"Fatboy Slim and Riva Starr - Eat, Sleep, Rave, Repeat\" /></a>\n" +
        "</div>";
    public static void main(String[] args)
    {
        String regexPattern = "<img[^>]*alt=[\"]*([\\w\\s-.:\\/,]+)[\"]*[^>]*/>";
        Pattern p = Pattern.compile(regexPattern);
        Matcher m = p.matcher(testStrings);
        // Loop so that every <img> in the input is printed, not only the first match
        while(m.find())
        {
            System.out.println(m.group(1));
        }
    }
}

The output was Fatboy Slim and Riva Starr - Eat, Sleep, Rave, Repeat. You may need to add more characters to [\w\s-.:\/,]+ to cover song titles with unusual characters; for example, the class as written does not match the & and ; of the &amp; entities in your data.
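
As a sketch of that last point: instead of listing every allowed character, the pattern can simply capture everything up to the closing quote. The version below is only an illustration under a few assumptions. The class name AltExtractor, the method names extractAlts and decodeEntities, and the shortened sample HTML are made up for this example, and decodeEntities handles only the three entities that appear in the question's data. It relies on java.util.regex alone, so a real HTML parser such as JSoup is probably still the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AltExtractor {
    // Captures everything between the double quotes of the alt attribute of an <img> tag.
    // Assumes double-quoted attribute values and no '>' inside attribute values.
    private static final Pattern IMG_ALT =
            Pattern.compile("<img\\b[^>]*?\\balt=\"([^\"]*)\"");

    /** Returns the alt text of every <img> tag in the given HTML, in document order. */
    public static List<String> extractAlts(String html) {
        List<String> alts = new ArrayList<>();
        Matcher m = IMG_ALT.matcher(html);
        while (m.find()) {
            alts.add(decodeEntities(m.group(1)));
        }
        return alts;
    }

    /** Decodes only the entities seen in the sample data; not a general HTML decoder. */
    public static String decodeEntities(String s) {
        return s.replace("&euml;", "\u00EB")   // e with diaeresis
                .replace("&acute;", "\u00B4")  // acute accent
                .replace("&amp;", "&");        // done last to avoid decoding twice
    }

    public static void main(String[] args) {
        // Shortened sample modelled on the markup in the question
        String html =
                "<div class=\"rc_release_list_item_picture\">\n"
              + " <img src=\"kein_release_vorhanden.png\" alt=\"Sebjak &amp; Mike Hawkins - Let Go\" title=\"Sebjak &amp; Mike Hawkins - Let Go\" />\n"
              + "</div>\n"
              + "<div class=\"rc_release_list_item_picture\">\n"
              + " <a href=\"/release/w-w/20839/thunder/\"><img src=\"thunder_s.png\" alt=\"W&amp;W - Thunder\" title=\"W&amp;W - Thunder\" /></a>\n"
              + "</div>";
        for (String alt : extractAlts(html)) {
            System.out.println(alt);
        }
    }
}
```

Capturing [^"]* means new punctuation in a title needs no pattern change, at the cost of returning raw entities that then have to be decoded separately.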

0

Here is your solution: it selects every img inside the .rc_release_list_item_picture elements and prints its alt attribute.

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.connect("http://www.housetime.fm/tracklist/").get();
        Elements links = doc.getElementsByClass("rc_release_list_item_picture");
        // Print the alt text of every image found inside those divs
        for (Element img : links.select("img[alt]")) {
            System.out.println(img.attr("alt"));
        }
    }
}
