Results 11 to 20 of 39
-
4th Jul 2010, 06:24 AM #11 Member
Try scanning for the point where an angle bracket closes and the text that follows does not start with another opening angle bracket, then store each character in a character array using a pointer. I can do it in C; I haven't started C# yet.
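A rough C# sketch of the scanning idea described above, assuming simple, well-formed markup with no '<' or '>' inside attribute values, comments or scripts. `TagTextScanner` and `ExtractText` are hypothetical names, and a `StringBuilder` stands in for the pointer-filled character array a C version would use.

```csharp
using System.Collections.Generic;
using System.Text;

public static class TagTextScanner
{
    // Collects every run of text that sits outside angle brackets.
    public static List<string> ExtractText(string markup)
    {
        var results = new List<string>();
        var current = new StringBuilder();
        bool insideTag = false;

        foreach (char c in markup)
        {
            if (c == '<')
            {
                // A tag opens: flush any text gathered so far.
                if (current.Length > 0)
                {
                    results.Add(current.ToString());
                    current.Clear();
                }
                insideTag = true;
            }
            else if (c == '>')
            {
                // The tag has closed; following characters are text again.
                insideTag = false;
            }
            else if (!insideTag)
            {
                current.Append(c);
            }
        }

        // Keep any trailing text that is not followed by another tag.
        if (current.Length > 0)
        {
            results.Add(current.ToString());
        }
        return results;
    }
}
```

Calling `TagTextScanner.ExtractText("<a tag><c tag>xyz.abc</c></a>")` should give a list with the single entry `xyz.abc`. It is only meant for trivial input; anything with nested or irregular markup will need more than this.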
-
4th Jul 2010, 08:50 AM #12 Respected Developer
Websites: PlatinumW.org NexusDDL.com HD-United.org CheckLinks.org FLVD.org
I think regex should do the trick.
Current projects:
Megaupload Premium Multifetch Script | FF Plugin: Tinypic and Imagevenue Image Remoter
Projects in hiatus:
IPB Linkchecker Bot | VB Linkchecker Bot
-
4th Jul 2010, 11:06 AM #13
-
4th Jul 2010, 11:23 AM #14 Member
It's relatively simple, actually. You can use RegEx as mentioned already. Here's an example for the "<a tag><c tag>xyz.abc</c></a>" string.
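Code:
using System.Text.RegularExpressions;

string StringToSearch = "<a tag><c tag>xyz.abc</c></a>";
// Group 1 holds the text between the tags. '/' needs no escaping in a C# string,
// and Groups is indexed with [1] in C# (Item(1) is VB syntax).
string StringFound = Regex.Match(StringToSearch, "<a tag><c tag>(.*)</c></a>").Groups[1].Value;
MessageBox.Show(StringFound);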
You can adapt the code yourself for whatever stuff you need.
If you need all found matches, use the Matches function instead of Match, then go through the results with a foreach loop and read the Groups of each match.
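A minimal sketch of the Matches approach mentioned above. The `TagMatcher` class, the `FindAll` method and the `<c tag>(.*?)</c>` pattern are made up for illustration. The lazy `(.*?)` is swapped in for the greedy `(.*)` from the example so that each match stops at the nearest closing tag.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagMatcher
{
    // Returns the captured text of every <c tag>...</c> pair in the input.
    public static List<string> FindAll(string input)
    {
        var found = new List<string>();
        // Regex.Matches returns every non-overlapping match, not just the first.
        foreach (Match m in Regex.Matches(input, "<c tag>(.*?)</c>"))
        {
            // Groups[0] is the whole match; Groups[1] is the first capture group.
            found.Add(m.Groups[1].Value);
        }
        return found;
    }
}
```

For the input `<a tag><c tag>xyz.abc</c><c tag>def.ghi</c></a>` this should return `xyz.abc` and `def.ghi`, in that order.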
-
4th Jul 2010, 02:22 PM #15 Respected Developer
Never use regex for parsing markup. Download SharpLeech and add a reference to the Engine dll in your project. Then add these using directives to your code:
Code:
using Hyperz.SharpLeech.Engine.Html;
using Hyperz.SharpLeech.Engine.Net;
Code:
var html = new HtmlDocument();
// load the html
html.LoadHtml("<div class=\"example\">foo</div>");
// use XPath to select the div
var node = html.DocumentNode.SelectSingleNode("//div[@class='example']");
var divContent = HttpUtility.HtmlDecode(node.InnerText);
-
4th Jul 2010, 03:27 PM #16 OP Member
Websites: InstantRDP.com
-
4th Jul 2010, 03:36 PM #17 Respected Developer
Can't find what files? You only need Hyperz.SharpLeech.Engine.dll. And nope, the example extracts the word foo. Take a look at XPath via the link I posted.
Regarding the other question:
Code:
var html = new HtmlDocument();
// load the html
html.LoadHtml(yourHtmlHere);
// use XPath to select all "A" elements from the html
var anchors = html.DocumentNode.SelectNodes("//a");
// keep only the anchors whose href starts with "http"
var filter = from a in anchors
             where a.GetAttributeValue("href", "").StartsWith("http")
             select a;
-
5th Jul 2010, 03:36 PM #18 Respected Developer
-
5th Jul 2010, 03:53 PM #19 Respected Developer
Cleaner? That sounds like something a VB6 coder would say. You being a coder should know that you can't use regex for parsing markup. For one, it is much too slow for that. Secondly, your expressions are static: they can't handle changes in the DOM structure without you having to redo them all. Then there is the issue of inner HTML, and so on.
The only case in which you can use regex is when you need only one simple string from a small HTML document whose contents you know won't change. For anything else it'll turn into a slow, unmanageable mess. I'd be more than happy to put this to the test.
-
5th Jul 2010, 04:00 PM #20 Respected Developer
Using a DOM parser is faster than regex? I thought DOM parsers used regex.
Anyway, those parsers use up too much memory. For his case regex is simple, and using a parser is overkill.