In the first two parts I covered setting up our tools, and creating a code template to build on. The next big challenge is replacing some of the built-in facilities of Javascript, like string handling, regular expressions and DOM manipulation, with C++ substitutes.
For string handling, watch out for the encoding! Most code examples use 8 bit ASCII strings, but Firefox supports Unicode strings, which allow a lot more languages to be represented. If we want a wide audience for our extension, we'll need to support them too.
C++ inherits C's built-in strings, as either char (for ASCII) or wchar_t (for Unicode) pointers. These are pretty old-fashioned and clunky to use. Common operations like appending two strings involve explicit function calls, and you have to manually manage the memory allocated for them.
We should use the STL's string class, std::wstring, instead. This is the wide-character version of std::string, and supports all the same operations, including appending just by doing "+". The equivalent for indexOf() is find(), which returns std::wstring::npos rather than -1 if the substring is not found. lastIndexOf() is similarly matched by rfind() (not find_last_of(), which looks for the last occurrence of any single character from its argument). The substring() method is closely matched by the substr() call, but beware, the second argument is the length of the substring you want, not the end index as in JS!
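As a rough sketch of those mappings, here are three small wrappers that give std::wstring the JS-style behavior. The function names indexOf, lastIndexOf and substring are my own hypothetical helpers, not part of the STL. Note that lastIndexOf is built on rfind(), since find_last_of() matches any single character from its argument rather than the whole substring.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: behaves like JS indexOf(), returning -1 instead of npos.
long indexOf(const std::wstring& text, const std::wstring& needle)
{
    const std::wstring::size_type pos = text.find(needle);
    return (pos == std::wstring::npos) ? -1 : static_cast<long>(pos);
}

// Hypothetical helper: behaves like JS lastIndexOf(), built on rfind().
long lastIndexOf(const std::wstring& text, const std::wstring& needle)
{
    const std::wstring::size_type pos = text.rfind(needle);
    return (pos == std::wstring::npos) ? -1 : static_cast<long>(pos);
}

// Hypothetical helper: behaves like JS substring(start, end), where end is
// the index just past the last character wanted. substr() takes a length,
// so the end index has to be converted.
std::wstring substring(const std::wstring& text,
                       std::wstring::size_type start,
                       std::wstring::size_type end)
{
    assert(start <= end && end <= text.size());
    return text.substr(start, end - start);
}
```

With these, something like substring(text, 4, 7) pulls out three characters, the same as text.substring(4, 7) would in JS, while the raw call text.substr(4, 7) would ask for seven.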
For regular expressions, our best bet is the Boost Regex library. You'll need to download and install Boost to use it, but luckily the Windows installer is very painless. Once that's done, we can use the boost::wregex object to do Unicode regular expression work (boost::regex only handles narrow char strings). One pain dealing with REs in C++ is that you have to use double backslashes in the string literals you set them up with, so that to get the expression \?, you need a literal "\\?", since the compiler otherwise treats the backslash as the start of a C character escape. The regular expression functions themselves are a bit different than Javascript's; regex_match() only returns true if the whole string matches the RE, and regex_search() is the one to use to find a match anywhere in the string, including repeated searches.
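Here's a minimal sketch of the difference between the two calls, and of the doubled backslashes. To keep it buildable without installing anything, it uses std::wregex from the C++11 standard library rather than Boost; the standard version was modeled on Boost.Regex, and as far as I know the same code works with boost::wregex, boost::wsmatch, boost::regex_match and boost::regex_search after including <boost/regex.hpp>. The helper names and the "q=" parameter are just assumptions for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: true only if the WHOLE string is a run of digits,
// because regex_match() never accepts a partial match.
bool isAllDigits(const std::wstring& text)
{
    // The literal L"\\d+" is the expression \d+ once the compiler has
    // processed the escaped backslash.
    static const std::wregex digits(L"\\d+");
    return std::regex_match(text, digits);
}

// Hypothetical helper: collects the value of every "q=" parameter in a URL
// by calling regex_search() repeatedly, restarting after each match.
std::vector<std::wstring> findQueryTerms(const std::wstring& url)
{
    // In source this is "[\\?&]q=([^&]*)"; the regex engine sees [\?&]q=([^&]*)
    static const std::wregex term(L"[\\?&]q=([^&]*)");

    std::vector<std::wstring> results;
    std::wstring::const_iterator start = url.begin();
    const std::wstring::const_iterator end = url.end();
    std::wsmatch match;
    while (std::regex_search(start, end, match, term)) {
        results.push_back(match[1].str()); // first capture group: the value
        start = match[0].second;           // continue just past this match
    }
    return results;
}
```

The loop in findQueryTerms is the C++ stand-in for calling exec() repeatedly on a global JS regular expression.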
DOM manipulation is possible through the MSHTML collection of interfaces. IHTMLDocument3 is a good start, since it supports a lot of familiar functions such as getElementsByTagName and getElementById. It does involve a lot of COM query-interfacing to work with the DOM, so I'd recommend using ATL smart pointers (CComPtr and CComQIPtr) to handle some of the house-keeping with reference counts and casting.
PeteSearch is now detecting search page loads, and extracting the search terms and links from the document. Next we'll look at XMLHttpRequest-style loading from within a BHO.
Can you shed some light on how to package this application for other people to download and "add-on" w/o going through the compiler? Thanks
Posted by: Dave | June 19, 2007 at 06:50 PM
Hi Dave,
if you're interested in PeteSearch, I'm working on converting it to Internet Explorer, and it's still not completely working yet. You can try it now on Firefox at http://petesearch.com/ though.
If you're thinking more generally, about how to create your own add-on for IE, you'll need to use a compiler to build one.
Sorry if I'm misunderstanding your question. Feel free to email me with more details if this doesn't help.
Posted by: Pete Warden | June 19, 2007 at 11:06 PM
Hi Pete,
Thanks for your reply. I actually followed your step through until I tried to use clickonce, which is supposed to be supported in the VC++ Express version. But I cannot find the Publish Tab in the project properties. And the "Deploy" option (if you right click the project in the solution explorer) is disabled (faded).
This is not what the VC++ Express MSDN documentation describes. I was wondering if I missed something.
Thanks.
Posted by: Dave | June 20, 2007 at 12:00 PM
Ah, I think I understand now. I wasn't aware of ClickOnce, that does look very convenient. Since installing a BHO requires registry access, that may complicate things a bit. I'll look into this a bit more, and let you know what I find out.
My current solution is to use the NSIS install system, since I've used it in the past, it's free, and well-supported.
I've got an NSIS script that just calls RegDLL PeteSearch.dll to do the registry magic. I'll be posting my current PeteSearch code soon, and will include that script.
If you're interested in NSIS, here's a good starting article:
http://en.wikipedia.org/wiki/Nullsoft_Scriptable_Install_System
Posted by: Pete Warden | June 20, 2007 at 04:00 PM