  1.     
    #11
    Member
    Try some logic along these lines: wherever an angle bracket closes and the text that follows does not start with another opening angle bracket, store each character in a character array using a pointer. I can do it in C; I haven't started C# yet.
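
The scan described above can be sketched in C# without pointers. This is a minimal sketch, not the poster's code: the names `TagScanner` and `TextOutsideTags` are hypothetical, and it assumes that `<` and `>` never appear inside attribute values or comments.

```csharp
using System.Text;

public static class TagScanner
{
    // Collects every character that lies outside <...> tags.
    // Assumption: '<' and '>' only ever open and close a tag.
    public static string TextOutsideTags(string input)
    {
        var text = new StringBuilder();
        bool insideTag = false;

        foreach (char c in input)
        {
            if (c == '<')
                insideTag = true;       // a tag starts, stop collecting
            else if (c == '>')
                insideTag = false;      // the tag ended, collect again
            else if (!insideTag)
                text.Append(c);         // plain text between tags
        }

        return text.ToString();
    }
}
```

Calling `TagScanner.TextOutsideTags("<a tag><c tag>xyz.abc</c></a>")` returns `xyz.abc`. A `StringBuilder` stands in for the character array and pointer of the C approach.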

  3.     
    #12
    Respected Developer
    I think regex should do the trick

  4.     
    #13
    Member
    Yeah, thanks Dman. It helped out.




  5.     
    #14
    Member
    It's relatively simple, actually. You can use regex, as already mentioned. Here's an example for the "<a tag><c tag>xyz.abc</c></a>" string.

    Code: 
    // requires: using System.Text.RegularExpressions;
    string StringToSearch = "<a tag><c tag>xyz.abc</c></a>";
    // Groups[1] holds the text captured by the parenthesised part of the pattern
    string StringFound = Regex.Match(StringToSearch, "<a tag><c tag>(.*)</c></a>").Groups[1].Value;
    MessageBox.Show(StringFound);


    You can adapt the code yourself for whatever you need.
    If you need every match, use Regex.Matches instead of Regex.Match, then go through the returned matches with a foreach loop and read Groups[1] from each one.
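
A sketch of the multi-match variant follows. The class and method names (`TagMatches`, `FindAll`) are hypothetical, and the pattern assumes the same made-up `<c tag>` markup as the example string. The lazy quantifier `(.*?)` is an addition here: with several `<c tag>` elements on one line, a greedy `(.*)` would swallow everything up to the last `</c>`.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagMatches
{
    // Returns the captured text of every <c tag>...</c> element in the input.
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();

        // Lazy (.*?) so each match stops at the first closing </c>.
        foreach (Match m in Regex.Matches(input, "<c tag>(.*?)</c>"))
        {
            results.Add(m.Groups[1].Value);
        }

        return results;
    }
}
```

For the input `<a tag><c tag>xyz.abc</c><c tag>def.ghi</c></a>`, `FindAll` returns the two strings `xyz.abc` and `def.ghi`.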

  6.     
    #15
    Respected Developer
    Never use regex for parsing markup. Download SharpLeech and add a reference to the Engine DLL in your project. Then add these using directives to your code:

    Code: 
    using Hyperz.SharpLeech.Engine.Html;
    using Hyperz.SharpLeech.Engine.Net;
    Now you can use it like this:
    Code: 
    var html = new HtmlDocument();

    // load the html
    html.LoadHtml("<div class=\"example\">foo</div>");

    // use XPath to select the div
    var node = html.DocumentNode.SelectSingleNode("//div[@class='example']");
    var divContent = HttpUtility.HtmlDecode(node.InnerText);
    XPath info: http://www.w3schools.com/xpath/default.asp

  7.     
    #16
    Member
    Where do I put this file, Hyperz.SharpLeech.Engine.dll?


    I guess your code extracts the word "example" from between the <div> tags that //div selects.
    But how do I extract links that start with http and end with .extension?




  8.     
    #17
    Respected Developer
    Can't find what files? You only need Hyperz.SharpLeech.Engine.dll. And nope, the example extracts the word foo. Take a look at XPath via the link I posted.

    Regarding the other question:
    Code: 
    var html = new HtmlDocument();

    // load the html
    html.LoadHtml(yourHtmlHere);

    // use XPath to select all "A" elements from the html
    var anchors = html.DocumentNode.SelectNodes("//a");

    // keep only those whose href starts with http
    var filter = from a in anchors
                 where a.GetAttributeValue("href", "").StartsWith("http")
                 select a;
    Just experiment with it.
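
The query above only checks the start of each link, while the question also asked for links ending in a given extension. A minimal sketch of that extra condition follows. To stay self-contained it works on plain href strings rather than on SharpLeech nodes, and `LinkFilter`, `Filter` and the sample URLs are hypothetical. The case-insensitive comparison is an assumption about what is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkFilter
{
    // Keeps the hrefs that start with "http" and end with the given extension.
    public static List<string> Filter(IEnumerable<string> hrefs, string extension)
    {
        return hrefs
            .Where(h => h.StartsWith("http", StringComparison.OrdinalIgnoreCase)
                     && h.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }
}
```

Given the hrefs `http://example.com/a.rar`, `/local/b.rar`, `https://example.com/c.zip` and `http://example.com/d.RAR`, calling `Filter(hrefs, ".rar")` keeps the first and the last. With the parser from the post, the same `EndsWith` test could presumably be added to the `where` clause next to `StartsWith`.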

  9.     
    #18
    Respected Developer
    Quote Originally Posted by Hyperz
    Never use regex for parsing markup. [...]
    I beg to differ - regex is much cleaner lol

  10.     
    #19
    Respected Developer
    Cleaner? That sounds like something a VB6 coder would say. You, being a coder, should know that you can't use regex for parsing markup. For one, it is much too slow for that. Secondly, your expressions are static: they can't handle changes in the DOM structure without you having to redo them all. Then there is the issue of inner HTML, and so on.

    The only case in which you can use regex is when you need just one simple string from a small HTML document whose contents you know won't change. For anything else it will turn into a slow, unmanageable mess. I'd be more than happy to put this to the test.
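
The "static expressions" point can be illustrated with a small sketch. It shows only that one claim and says nothing about speed or memory. The class name `RegexFragility` and the second input string are hypothetical, and the pattern is written against the exact tag used in the earlier div example.

```csharp
using System.Text.RegularExpressions;

public static class RegexFragility
{
    // Pattern written against one exact form of the opening tag.
    private const string Pattern = "<div class=\"example\">(.*?)</div>";

    // Returns the captured text, or null when the pattern does not match.
    public static string Extract(string html)
    {
        Match m = Regex.Match(html, Pattern);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

`Extract("<div class=\"example\">foo</div>")` returns `foo`, but `Extract("<div id=\"x\" class=\"example\">foo</div>")` returns null, because the added `id` attribute no longer fits the literal pattern. An XPath query such as `//div[@class='example']` selects on the attribute itself, so it would be expected to keep matching in that case.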

  11.     
    #20
    Respected Developer
    Using a DOM parser is faster than regex? I thought DOM parsers used regex. Anyway, those parsers use up too much memory. For his case regex is simple, and using a parser is overkill.
