Q: Work With HTML and the Clipboard
My application needs to place HTML on the clipboard, but I can't
figure out how to do this so that other applications understand
that's what it is. I've seen references to the HTML Clipboard
Format (CF_HTML), but I can't find the definition for that
constant. How should I proceed?
A:
Using the CF_HTML clipboard format with the Windows clipboard is
a bit confusing, in part because it's not a native clipboard
format. It's a registered format, so it isn't a constant at all;
its value can differ from system to system. You obtain a
registered clipboard format's value with a simple API call:
RegisterClipboardFormat. The first time this function is called
with a given string, it returns a unique number in the range
&HC000 to &HFFFF. Each subsequent call with that same string,
made by any process running on the system, returns the same
value. The magic string to use for this format is "HTML Format":
' Module-level declaration
Private Declare Function _
    RegisterClipboardFormat _
    Lib "user32" _
    Alias "RegisterClipboardFormatA" _
    (ByVal lpString As String) As Long

' Inside a procedure
Dim CF_HTML As Long
Const RegHtml As String = "HTML Format"
CF_HTML = _
    RegisterClipboardFormat(RegHtml)
Before you can place your HTML data on the clipboard, you must
construct a descriptive header and prepend it to the data. This
header tells other applications which version of the format
description you're using, the offsets within the data where the
HTML starts and stops, and where the actual selection begins and
ends. To picture the selection, consider a user who selects a
portion of an HTML document, or even part of an element (such as
a few rows in a table). Other portions of the page (such as
inline style definitions) might be required to render the
selection fully, so you likely must supply more than the raw
selection to put HTML on the clipboard in its full context. A
sample header might look like this:
Version:1.0
StartHTML:000000258
EndHTML:000001491
StartFragment:000001172
EndFragment:000001411
Applications use the StartFragment and EndFragment attributes
to determine which data to paste, and they might or might not
use the remaining HTML to help format the selected portion. You
must inject HTML comments into the data to identify the selected
area further. Obviously, you must do this before you build the
final header, because the offsets won't be stable otherwise. The
opening/closing comment tags for the selected data are
"<!--StartFragment-->" and "<!--EndFragment-->",
respectively (see Listing 1).
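As a rough illustration of the ordering described above (comments
first, header last), a minimal sketch might look like the following.
The names BuildCfHtml and MakeHeader, the bare <html><body> wrapper,
and the nine-digit field width are illustrative assumptions, not the
column's own listing. The sketch also assumes single-byte (plain
ASCII) text, so that character counts and byte offsets agree.

```vb
' Hypothetical helper: wraps a fragment in the marker comments and
' prepends a CF_HTML header. Assumes single-byte (ASCII) text.
Public Function BuildCfHtml(ByVal Fragment As String) As String
    Const StartTag As String = "<!--StartFragment-->"
    Const EndTag As String = "<!--EndFragment-->"
    Const Pre As String = "<html><body>"
    Const Post As String = "</body></html>"
    Dim Body As String
    Dim StartHtml As Long, EndHtml As Long
    Dim StartFrag As Long, EndFrag As Long

    ' Inject the comments before any offsets are computed.
    Body = Pre & StartTag & Fragment & EndTag & Post

    ' Fields are zero-padded to a fixed width, so a header built
    ' with dummy values has the same length as the final one.
    StartHtml = Len(MakeHeader(0, 0, 0, 0))
    EndHtml = StartHtml + Len(Body)
    StartFrag = StartHtml + Len(Pre) + Len(StartTag)
    EndFrag = StartFrag + Len(Fragment)

    BuildCfHtml = MakeHeader(StartHtml, EndHtml, _
        StartFrag, EndFrag) & Body
End Function

' Hypothetical helper: formats the five header lines.
Private Function MakeHeader(ByVal StartHtml As Long, _
    ByVal EndHtml As Long, ByVal StartFrag As Long, _
    ByVal EndFrag As Long) As String
    Const Fmt As String = "000000000"
    MakeHeader = "Version:1.0" & vbCrLf & _
        "StartHTML:" & Format$(StartHtml, Fmt) & vbCrLf & _
        "EndHTML:" & Format$(EndHtml, Fmt) & vbCrLf & _
        "StartFragment:" & Format$(StartFrag, Fmt) & vbCrLf & _
        "EndFragment:" & Format$(EndFrag, Fmt) & vbCrLf
End Function
```

Calling Debug.Print BuildCfHtml("<b>Hello</b>") in the Immediate
window is a quick way to eyeball the result. Padding every field to
the same width is what keeps the offsets stable: the header's length
is known before any of its values are.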
I don't have enough room here to detail all of this header's
aspects, so I'll hit a few highlights and refer you to the
sample code and further reading (see Additional
Resources). You must keep several critical points in mind.
First, the offsets listed in the header are zero-based, whereas
VB's string functions such as Mid$ and InStr are one-based, so
you must adjust your string-manipulation routines accordingly.
Also, if you're reading as well as writing these headers, you
must assume that the number of digits is variable (for example,
Internet Explorer [IE] uses 9, and Word uses 10).
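The two points above can be sketched as follows: read each offset up
to the end of its line rather than assuming a digit count, then add
one when handing it to Mid$. ReadOffset and GetFragment are
hypothetical names used only for illustration, and the sketch again
assumes single-byte text so that byte offsets match character
positions.

```vb
' Hypothetical helper: returns the numeric value of a header field
' such as "StartFragment", or -1 if the field isn't found.
Public Function ReadOffset(ByVal Data As String, _
    ByVal Field As String) As Long
    Dim p As Long, q As Long

    ReadOffset = -1
    p = InStr(1, Data, Field & ":", vbBinaryCompare)
    If p = 0 Then Exit Function
    p = p + Len(Field) + 1              ' first character after the colon

    ' Read to the end of the line, however many digits there are.
    q = InStr(p, Data, vbLf)
    If q = 0 Then q = Len(Data) + 1

    ' Val stops at the first non-numeric character (such as a CR).
    ReadOffset = CLng(Val(Mid$(Data, p, q - p)))
End Function

' Hypothetical helper: extracts the selected fragment.
Public Function GetFragment(ByVal Data As String) As String
    Dim s As Long, e As Long

    s = ReadOffset(Data, "StartFragment")
    e = ReadOffset(Data, "EndFragment")
    If s < 0 Or e < s Then Exit Function

    ' Header offsets are zero-based; Mid$ is one-based.
    GetFragment = Mid$(Data, s + 1, e - s)
End Function
```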
Finally, if you place only CF_HTML on the clipboard,
applications such as Word and FrontPage don't know what to do
with it. You must also supply a plain-text rendition of the
stylized HTML to the clipboard for these apps to behave as
expected. Scads of tools perform HTML-to-text conversions, or
the extremely macho might prefer to roll their own parsers. But
no Windows programmer should ever have to hand-parse HTML again.
You can call upon the OS instead for this everyday task:
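Public Function Html2Text(ByVal Data _
    As String) As String
    Dim obj As Object
    ' Return an empty string if the
    ' parser is unavailable or fails
    On Error Resume Next
    ' "htmlfile" is the late-bound HTML
    ' document object supplied by IE
    Set obj = _
        CreateObject("htmlfile")
    obj.Open
    obj.Write Data
    obj.Close
    Html2Text = obj.Body.InnerText
    Set obj = Nothing
End Function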
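With the plain-text rendition in hand, both formats still have to be
placed on the clipboard in a single open/close cycle. VB's Clipboard
object has no way to set a registered format, so one approach is to go
through the API directly. What follows is a minimal sketch under
stated assumptions, not a definitive implementation: CopyHtml and
PutAnsi are hypothetical names, the text is assumed to be ANSI-safe,
and error handling is reduced to a Boolean result.

```vb
' Omit any Declare that already exists in the same module.
Private Declare Function RegisterClipboardFormat Lib "user32" _
    Alias "RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function OpenClipboard Lib "user32" _
    (ByVal hWnd As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function SetClipboardData Lib "user32" _
    (ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function GlobalAlloc Lib "kernel32" _
    (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function GlobalLock Lib "kernel32" _
    (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" _
    (ByVal hMem As Long) As Long
Private Declare Function GlobalFree Lib "kernel32" _
    (ByVal hMem As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByVal Dest As Long, Src As Any, ByVal cb As Long)

Private Const GMEM_MOVEABLE As Long = &H2
Private Const GMEM_ZEROINIT As Long = &H40
Private Const CF_TEXT As Long = 1

' Hypothetical helper: copies Text, as ANSI bytes, into a global
' memory block and hands that block to the (already open) clipboard.
Private Function PutAnsi(ByVal Fmt As Long, _
    ByVal Text As String) As Boolean
    Dim Bytes() As Byte
    Dim hMem As Long, lpMem As Long

    If Len(Text) = 0 Then Exit Function
    Bytes = StrConv(Text, vbFromUnicode)

    ' One extra byte; zero-initialization supplies the null terminator.
    hMem = GlobalAlloc(GMEM_MOVEABLE Or GMEM_ZEROINIT, UBound(Bytes) + 2)
    If hMem = 0 Then Exit Function

    lpMem = GlobalLock(hMem)
    If lpMem <> 0 Then
        CopyMemory lpMem, Bytes(0), UBound(Bytes) + 1
        GlobalUnlock hMem
        PutAnsi = (SetClipboardData(Fmt, hMem) <> 0)
    End If

    ' On success the system owns the block; free it only on failure.
    If Not PutAnsi Then GlobalFree hMem
End Function

' Hypothetical entry point: CfHtmlData must already include its header.
Public Function CopyHtml(ByVal hWndOwner As Long, _
    ByVal CfHtmlData As String, ByVal PlainText As String) As Boolean
    Dim CF_HTML As Long

    CF_HTML = RegisterClipboardFormat("HTML Format")
    If CF_HTML = 0 Then Exit Function
    If OpenClipboard(hWndOwner) = 0 Then Exit Function

    EmptyClipboard
    ' And does not short-circuit in VB, so both formats are attempted.
    CopyHtml = PutAnsi(CF_HTML, CfHtmlData) And PutAnsi(CF_TEXT, PlainText)
    CloseClipboard
End Function
```

From a form, a call might look like
ok = CopyHtml(Me.hWnd, Payload, Html2Text(Html)), where Payload is the
complete CF_HTML string and Html is the markup it was built from. A
real window handle is passed because the Win32 documentation warns
that SetClipboardData can fail when the clipboard was opened with a
null owner.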
Leveraging IE isn't necessarily the quickest method for
parsing HTML, but the expediency it offers is a good tradeoff in
this case. —K.E.P.
About the Author
Karl E. Peterson is a GIS analyst with a regional transportation
planning agency and serves as a member of the VSM
Editorial Advisory Board. Online, he's a Microsoft MVP and a
section leader on several DevX forums. Find more of Karl's VB
samples at www.mvps.org/vb.