Version 1.0.4
1. Fix Declarative Shadow DOM elements not being loaded properly.
2. Fix global CSS in HTML files disrupting the layout of the main program.
3. Change the HTML sanitization package to "sanitize-html" and sanitize only HTML attributes to prevent XSS attacks.
4. Isolate the content of the HTML file with Shadow DOM and a div style to prevent CSS style pollution.
5. Add support for SingleFileZ's compressed HTML-like file format.
nuthrash committed Nov 11, 2022
1 parent a739e9e commit 391644a
Showing 6 changed files with 197 additions and 56 deletions.
25 changes: 24 additions & 1 deletion README.md
@@ -42,4 +42,27 @@ This is a plugin for Obsidian (https://obsidian.md). Can open document with `.ht
- Cannot see local image files like `<img src="./image1.jpg" />` or `<img src="file:///C:/image1.jpg" />`
    - This is a constraint of Obsidian: it does not allow HTML code to access local files directly.
    - One possible remedy is to re-save the HTML file as a complete HTML file with a dedicated browser extension such as "[SingleFile](https://github.com/gildas-lormeau/SingleFile)", which can save a complete page (with CSS, images, fonts, frames, etc.) as a single HTML file. After getting the complete HTML file, put it into the vault folder where obsidian-html-plugin is installed and open it; you should then see all images.
    - Another remedy is to add the `app://local/` or `app://local//` prefix to the `src` attribute by hand (refer to "[Allow embed of Local images using `![](file:///...)`](https://forum.obsidian.md/t/allow-embed-of-local-images-using-file/1990/4)"). However, this workaround would be sanitized out after version 1.0.1 of obsidian-html-plugin, so you need obsidian-html-plugin 1.0.0 to make it work.
    - Another remedy is to add the `app://local/` or `app://local//` prefix to the `src` attribute by hand (refer to "[Allow embed of Local images using `![](file:///...)`](https://forum.obsidian.md/t/allow-embed-of-local-images-using-file/1990/4)"). However, this workaround is sanitized out by versions 1.0.1 ~ 1.0.3 of obsidian-html-plugin, so you need obsidian-html-plugin 1.0.0 or 1.0.4+ to make it work.

- After some .html files are opened, they look like blank pages and the original contents cannot be seen.
    - In fact, currently (since version 1.0.4), this plugin can handle only some kinds of HTML files:
        1. Standard [HTML5](https://html.spec.whatwg.org/) files
        2. Compressed HTML-like files made by [SingleFileZ](https://github.com/gildas-lormeau/SingleFileZ)
    - Therefore, when opening an unsupported file format, this plugin shows a related message or an almost blank page.
    - "open document with `.html` and `.htm` file extensions" is a description written for end users without a technical background. It does not mean this plugin can open every kind of file with a .html or .htm extension, especially when the file is actually another document type that was renamed to .html or .htm.
    - If you want to open an ePub file, install the "ePub Reader" plugin to open it, instead of renaming it to xxx.html and then asking why this plugin cannot open it.

- Some HTML elements disappear
    - That might be because they were:
        1. Removed by the HTML sanitization mechanism
        2. Hidden or made invisible
    - You could try manually installing obsidian-html-plugin 1.0.0 to see whether the missing HTML elements become visible. If they do, you could create a new issue on the [Issues page](https://github.com/nuthrash/obsidian-html-plugin/issues) to let me know, and I will discuss it with you.
    - If you still cannot see the missing HTML elements after installing obsidian-html-plugin 1.0.0, they were hidden or became invisible. This often occurs when the HTML element uses an advanced feature such as "[Declarative Shadow DOM](https://web.dev/declarative-shadow-dom/)" (supported since version 1.0.4) that this plugin or Obsidian does not support yet. In that case, you could create a new issue on the [Issues page](https://github.com/nuthrash/obsidian-html-plugin/issues) to let me know, and I will discuss it with you.

- Almost no script code works
    - That might be because it was:
        1. Blocked by the HTML loading procedure
        2. Removed by the HTML sanitization mechanism
    - Obsidian's developer team is very concerned about XSS attacks, so they want plugin developers to follow this [tip](https://github.com/obsidianmd/obsidian-releases/blob/master/plugin-review.md#avoid-innerhtml-outerhtml-and-insertadjacenthtml) to prevent them. Therefore, almost all script code inside `<script>` elements in the HTML file is blocked, and so are external script files.
    - Meanwhile, the HTML sanitization mechanism sanitizes potential XSS code more deeply, so code such as `<... onload="alert(1)">` is removed.
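
The attribute removal described in the last item can be illustrated with a simplified model of allow-list filtering. This is only a sketch and not the sanitize-html implementation: `SAMPLE_ALLOWED`, `isAttributeAllowed`, and `filterAttributes` are hypothetical names, and the sketch assumes that an entry ending in `*` (such as `data-*`) matches by prefix.

```typescript
// Simplified model of attribute allow-listing; NOT the sanitize-html implementation.
// Assumption: an entry ending in "*" matches any attribute name with that prefix.
const SAMPLE_ALLOWED: string[] = ["href", "src", "style", "data-*", "aria-*"];

// Returns true when the attribute name is on the allow list (case-insensitive).
function isAttributeAllowed(name: string, allowed: string[]): boolean {
  const lower = name.toLowerCase();
  return allowed.some((entry) =>
    entry.endsWith("*") ? lower.startsWith(entry.slice(0, -1)) : lower === entry
  );
}

// Keeps only the allowed attributes of a single element.
function filterAttributes(
  attrs: Record<string, string>,
  allowed: string[]
): Record<string, string> {
  const kept: Record<string, string> = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (isAttributeAllowed(name, allowed)) kept[name] = value;
  }
  return kept;
}

// Keeps src and data-id, drops the onload event handler.
const cleaned = filterAttributes(
  { src: "a.png", onload: "alert(1)", "data-id": "7" },
  SAMPLE_ALLOWED
);
console.log(cleaned);
```

Because event-handler attributes such as `onload` are simply absent from the list, they are dropped without any special-case logic.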

2 changes: 1 addition & 1 deletion esbuild.config.mjs
@@ -34,7 +34,7 @@ esbuild.build({
...builtins],
format: 'cjs',
watch: !prod,
- target: 'es2018',
+ target: 'es2020',
logLevel: "info",
sourcemap: prod ? false : 'inline',
treeShaking: true,
2 changes: 1 addition & 1 deletion manifest.json
@@ -1,7 +1,7 @@
{
"id": "obsidian-html-plugin",
"name": "HTML Reader",
- "version": "1.0.3",
+ "version": "1.0.4",
"minAppVersion": "0.15.0",
"description": "This is a HTML file reader plugin for Obsidian. Can open document with \".html\" and \".htm\" file extensions.",
"author": "Nuthrash",
6 changes: 4 additions & 2 deletions package.json
@@ -1,6 +1,6 @@
{
"name": "obsidian-html-plugin",
- "version": "1.0.3",
+ "version": "1.0.4",
"description": "This is a HTML file reader plugin for Obsidian. Can open document with \".html\" and \".htm\" file extensions.",
"main": "main.js",
"scripts": {
@@ -27,6 +27,8 @@
"typescript": "4.7.4"
},
"dependencies": {
- "dompurify": "^2.4.0"
+ "@zip.js/zip.js": "^2.6.52",
+ "sanitize-html": "^2.7.3",
+ "single-filez-core": "^1.0.26"
}
}
215 changes: 165 additions & 50 deletions src/HtmlView.ts
@@ -1,5 +1,8 @@
import { WorkspaceLeaf, FileView, TFile, sanitizeHTMLToDom } from "obsidian";
import DOMPurify from 'dompurify';
import sanitizeHtml from 'sanitize-html';

import { extract } from "single-filez-core/processors/compression/compression-extract.js";
import * as zip from '@zip.js/zip.js';

export const HTML_FILE_EXTENSIONS = ["html", "htm"];
export const VIEW_TYPE_HTML = "html-view";
@@ -21,66 +24,83 @@ export class HtmlView extends FileView {
this.contentEl.empty();

try {
// whole HTML file strings
const contents = await this.app.vault.read(file);
// whole HTML file ArrayBuffer
const contents = await this.app.vault.readBinary(file);

// Obsidian's HTMLElement and Node API: https://github.com/obsidianmd/obsidian-api/blob/master/obsidian.d.ts

// https://github.com/cure53/DOMPurify
// https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/dompurify/index.d.ts
const purifyConfig = {
RETURN_DOM: true, // return DOM object
WHOLE_DOCUMENT: true, // include <html>

// Default TAGs ATTRIBUTEs allow list & blocklist https://github.com/cure53/DOMPurify/wiki/Default-TAGs-ATTRIBUTEs-allow-list-&-blocklist
// allowed tags https://github.com/cure53/DOMPurify/blob/main/src/tags.js
ADD_TAGS: ['link'], // allow external css files
// allowed attributes https://github.com/cure53/DOMPurify/blob/main/src/attrs.js
};
let htmlStr = null;

DOMPurify.addHook('afterSanitizeAttributes', function (node) {
if(node.nodeName) {
switch(node.nodeName) {
// disable some interactive elements to avoid XSS attacks
// case 'INPUT':
// case 'BUTTON':
// case 'TEXTAREA':
// case 'SELECT':
// node.setAttribute('disabled', 'disabled');
// break;
try {
// the HTML file made by SingleFileZ
globalThis.zip = zip;
const { docContent } = await extract(new Blob([new Uint8Array(contents)]), { noBlobURL: true });

htmlStr = docContent;
} catch {
// the HTML file not made by SingleFileZ
const decoder = new TextDecoder();
htmlStr = decoder.decode(contents); // decode with UTF8
}

// https://github.com/apostrophecms/sanitize-html
// https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/sanitize-html/index.d.ts
const purifyConfig = {
allowedTags: false, // allow all tags

case 'BODY':
// avoid some HTML files unable to scroll, only when 'overflow' is not set
if( node.style.overflow === '' )
node.style.overflow = 'auto';
// allowedAttributes: false, // allow all attributes // {}, // disallow all attributes
allowedAttributes: {
'*': ALLOWED_ATTRS
},

allowedClasses: false, // allow all classes
// allowedStyles: false, // allow all styles

// avoid some HTML files unable to select text, only when 'user-select' is not set
if( node.style.userSelect === '' )
node.style.userSelect = 'text';
break;
}
}
});
allowedIframeHostnames: false, // allow all Iframe Hostnames ['www.youtube.com', 'player.vimeo.com']

// default allowed schemes: http, https, ftp, mailto
allowedSchemes: sanitizeHtml.defaults.allowedSchemes.concat([ 'app', 'callto', 'cid', 'data', 'ftps', 'tel', 'xmpp' ])
};

const cleanHtml = sanitizeHtml( htmlStr, purifyConfig );

// using Obsidian's internal DOMParser to build Declarative Shadow DOM
const domW = new window.DOMParser().parseFromString(cleanHtml, 'text/html', { includeShadowRoots: true });

/*
// Obsidian's internal DOMPurify module
// const cleanDom = window.DOMpurify.sanitize( contents, purifyConfig ); // Cannot read properties of undefined(read 'sanitize')
await applyPatches( domW );

// window.DOMPurify.addHook('afterSanitizeAttributes'........
const contentDiv = this.contentEl.createDiv();

// Some internal links like "#cite_1" would be replaced as "app://obsidian.md/index.html#cite_1"
window.DOMPurify.clearConfig();
window.DOMPurify.setConfig( purifyConfig );
const cleanDom = sanitizeHTMLToDom( contents );
this.contentEl.appendChild( cleanDom );
window.DOMPurify.clearConfig();
window.DOMPurify.removeHook( 'afterSanitizeAttributes' );
*/
// using Shadow DOM element and CSS style attr. to isolate the contents of HTML file to avoid CSS Style Pollution
contentDiv.setAttribute( 'style', 'transform: scale(1);' );
let shadow = contentDiv.attachShadow({mode: 'open'});

const cleanDom = DOMPurify.sanitize( contents, purifyConfig );
this.contentEl.appendChild( cleanDom );
DOMPurify.removeHook( 'afterSanitizeAttributes' );
// while clicking, fix internal links(in-place anchor) replaced by Obsidian at runtime
shadow.addEventListener('click', (event) => {
const elem = event.target, appAddr = "app://obsidian.md/index.html#";

function scrollAnchorRecursive(node) {
if( node == null || node.nodeName == null || node.nodeName === "BODY" )
return;

if( node.nodeName === 'A' ) {
if( node.href && node.href.startsWith(appAddr) ) {
const idInteral = decodeURIComponent( node.href.slice(appAddr.length) );

const targetElem = node.getRootNode().getElementById( idInteral );
if( targetElem )
targetElem.scrollIntoView();
}
} else {
return scrollAnchorRecursive(node.parentNode);
}
}

return scrollAnchorRecursive(elem)
});

shadow.appendChild( domW.documentElement );

} catch (error) {
showError(error);
}
@@ -103,6 +123,101 @@ export class HtmlView extends FileView {
}
}

// returns a Map whose entries are [0] the variable name and [1] its value content
async function cutCssVariables( doc: HTMLDocument, cssElementName: string, removeVar: boolean ) : Promise<Map<string, string>> {
let map = new Map<string, string>();

// Obsidian's internal DOMParser does not generate HTMLDocument's styleSheets property,
// therefore all style sheets shall be collected another way!

let allStyles = doc.getElementsByTagName('style');
if( !allStyles || allStyles.length <= 0 )
return map;

let removeSet = new Set<CSSStyleDeclaration>();
Array.from(allStyles).forEach( (styleEle) => {
try {
Array.from(styleEle.sheet.cssRules).forEach((rule) => {
// type 1 is CSSStyleRule https://developer.mozilla.org/en-US/docs/Web/API/CSSStyleRule
if( rule.type != 1 || rule.selectorText !== cssElementName )
return;

// rule.style is CSSStyleDeclaration https://developer.mozilla.org/en-US/docs/Web/API/CSSStyleDeclaration
for( const propName of rule.style ) {
let pn = propName.trim();
if( !pn.startsWith("--") )
continue;

if( !map.has(pn) ) {
map.set( pn, rule.style.getPropertyValue(propName).trim() );
}
if( removeVar && !removeSet.has(rule.style) )
removeSet.add( rule.style );
}
});
} catch {
//ignore different domain of styleSheet.href
}
});

// remove old variables and its content
if( removeVar && removeSet.size > 0 && map.size > 0 ) {
for( const style of removeSet ) {
for( const kvp of map ) {
if( style.cssText.contains(kvp[0]) )
style.removeProperty( kvp[0] );
}
}
}

return map;
}

async function applyPatches( doc: HTMLDocument ): Promise<void> {
const bodyEle = doc.body;

// avoid some HTML files unable to scroll, only when 'overflow' is not set
if( bodyEle.style.overflow === '' )
bodyEle.style.overflow = 'auto';
// avoid some HTML files unable to select text, only when 'user-select' is not set
if( bodyEle.style.userSelect === '' )
bodyEle.style.userSelect = 'text';

// fix CSS :root global variables to :host for Shadow DOM
let cssVars = await cutCssVariables( doc, ":root", true );
if( cssVars.size > 0 ) {
const hostVars = Array.from(cssVars).map( cssVar => `${cssVar[0]}: ${cssVar[1]}` ).join('; ') + ";";
const styleEle = doc.createElement( "style" );
styleEle.innerText = `:host { ${hostVars} }`;
bodyEle.appendChild( styleEle );
}

// ESLint
Array.from(doc.links).forEach( (ele) => {
if( ele.instanceOf(HTMLAnchorElement) && ele.getAttribute("target") === "_blank" )
ele.setAttribute( "rel", "noreferrer noopener" );
});
}

export const ALLOWED_ATTRS = [
// default allowed attributes of sanitize-html
'center', 'target',

// extra allowed attributes from DOMPurify
// https://github.com/cure53/DOMPurify/blob/main/src/attrs.js
'accept', 'action', 'align', 'alt', 'autocapitalize', 'autocomplete', 'autopictureinpicture', 'autoplay', 'background', 'bgcolor', 'border', 'capture', 'cellpadding', 'cellspacing', 'checked', 'cite', 'class', 'clear', 'color', 'cols', 'colspan', 'controls', 'controlslist', 'coords', 'crossorigin', 'datetime', 'decoding', 'default', 'dir', 'disabled', 'disablepictureinpicture', 'disableremoteplayback', 'download', 'draggable', 'enctype', 'enterkeyhint', 'face', 'for', 'headers', 'height', 'hidden', 'high', 'href', 'hreflang', 'id', 'inputmode', 'integrity', 'ismap','kind', 'label', 'lang', 'list', 'loading', 'loop', 'low', 'max', 'maxlength', 'media', 'method', 'min','minlength', 'multiple', 'muted', 'name', 'nonce', 'noshade', 'novalidate', 'nowrap', 'open', 'optimum', 'pattern', 'placeholder', 'playsinline', 'poster', 'preload', 'pubdate', 'radiogroup', 'readonly', 'rel', 'required', 'rev', 'reversed', 'role', 'rows', 'rowspan', 'spellcheck', 'scope', 'selected', 'shape', 'size', 'sizes', 'span', 'srclang', 'start', 'src', 'srcset', 'step', 'style', 'summary', 'tabindex', 'title', 'translate', 'type', 'usemap', 'valign', 'value', 'width', 'xmlns', 'slot',
// SVG
'accent-height', 'accumulate', 'additive', 'alignment-baseline', 'ascent', 'attributename', 'attributetype', 'azimuth', 'basefrequency', 'baseline-shift', 'begin', 'bias', 'by', 'class', 'clip', 'clippathunits', 'clip-path', 'clip-rule', 'color', 'color-interpolation', 'color-interpolation-filters', 'color-profile', 'color-rendering', 'cx', 'cy', 'd', 'dx', 'dy', 'diffuseconstant', 'direction', 'display', 'divisor', 'dur', 'edgemode', 'elevation', 'end', 'fill', 'fill-opacity', 'fill-rule', 'filter', 'filterunits', 'flood-color','flood-opacity', 'font-family', 'font-size', 'font-size-adjust', 'font-stretch', 'font-style', 'font-variant', 'font-weight', 'fx', 'fy', 'g1', 'g2', 'glyph-name', 'glyphref', 'gradientunits', 'gradienttransform', 'image-rendering', 'in', 'in2', 'k', 'k1', 'k2', 'k3', 'k4', 'kerning', 'keypoints', 'keysplines', 'keytimes', 'lengthadjust', 'letter-spacing', 'kernelmatrix', 'kernelunitlength', 'lighting-color', 'local', 'marker-end', 'marker-mid', 'marker-start', 'markerheight', 'markerunits', 'markerwidth', 'maskcontentunits', 'maskunits', 'mask', 'mode', 'numoctaves', 'offset', 'operator', 'opacity', 'order', 'orient', 'orientation', 'origin', 'overflow', 'paint-order', 'path', 'pathlength', 'patterncontentunits', 'patterntransform', 'patternunits', 'points', 'preservealpha', 'preserveaspectratio', 'primitiveunits', 'r', 'rx', 'ry', 'radius', 'refx', 'refy', 'repeatcount', 'repeatdur', 'restart', 'result', 'rotate', 'scale', 'seed', 'shape-rendering', 'specularconstant', 'specularexponent', 'spreadmethod', 'startoffset', 'stddeviation', 'stitchtiles', 'stop-color', 'stop-opacity', 'stroke-dasharray', 'stroke-dashoffset', 'stroke-linecap', 'stroke-linejoin', 'stroke-miterlimit', 'stroke-opacity', 'stroke', 'stroke-width', 'surfacescale', 'systemlanguage', 'tabindex', 'targetx', 'targety', 'transform', 'transform-origin', 'text-anchor', 'text-decoration', 'text-rendering', 'textlength', 'u1', 'u2', 'unicode', 'values', 'viewbox', 
'visibility', 'version', 'vert-adv-y', 'vert-origin-x', 'vert-origin-y', 'word-spacing', 'wrap', 'writing-mode', 'xchannelselector', 'ychannelselector', 'x', 'x1', 'x2', 'y', 'y1', 'y2', 'z', 'zoomandpan',
// mathML
'accent', 'accentunder', 'bevelled', 'close', 'columnsalign', 'columnlines', 'columnspan', 'denomalign', 'depth', 'displaystyle', 'encoding', 'fence', 'frame', 'largeop', 'length', 'linethickness', 'lspace', 'lquote', 'mathbackground', 'mathcolor', 'mathsize', 'mathvariant', 'maxsize', 'minsize', 'movablelimits', 'notation', 'numalign', 'open', 'rowalign', 'rowlines', 'rowspacing', 'rowspan', 'rspace', 'rquote', 'scriptlevel', 'scriptminsize', 'scriptsizemultiplier', 'selection', 'separator', 'separators', 'stretchy', 'subscriptshift', 'supscriptshift', 'symmetric', 'voffset',
// XML
'xlink:href', 'xml:id', 'xlink:title', 'xml:space', 'xmlns:xlink',

// default allowed attributes by this plugin
'async', 'charset', 'collapse', 'collapsed', 'content', 'data', 'defer', 'external', 'http-equiv', 'property', 'sandbox', 'scoped', 'shadowroot', 'text', 'url', 'var',
'aria-*', 'data-*', 'href-*', 'src-*', 'style-*',
];

export async function showError(e: Error): Promise<void> {
const notice = new Notice("", 8000);
// @ts-ignore
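
The click handler added to `HtmlView.ts` in this commit recovers the original in-page anchor from links that Obsidian rewrites at runtime. The string-handling part of that step can be sketched in isolation as follows. `internalAnchorId` is a hypothetical helper name, not a function from the plugin, and the `null` returns for non-matching or malformed links are an assumption of this sketch.

```typescript
// Prefix that, according to the click handler in this commit, Obsidian puts in
// front of in-page anchors such as "#cite_1" at runtime.
const APP_ADDR = "app://obsidian.md/index.html#";

// Hypothetical helper: returns the decoded element id, or null when the href
// is not a rewritten in-page anchor.
function internalAnchorId(href: string): string | null {
  if (!href.startsWith(APP_ADDR)) return null;
  const encoded = href.slice(APP_ADDR.length);
  if (encoded === "") return null;
  try {
    return decodeURIComponent(encoded);
  } catch {
    return null; // malformed percent-encoding
  }
}

console.log(internalAnchorId("app://obsidian.md/index.html#cite_1")); // cite_1
console.log(internalAnchorId("https://example.com/page#cite_1")); // null
```

In the plugin, the resulting id is then looked up with `getElementById` on the link's root node (the shadow root), so the lookup stays inside the isolated document.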
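
`applyPatches()` in this commit moves CSS custom properties from `:root` to `:host`, because a `:root` selector in a stylesheet inside a shadow tree does not match anything there. The string-building step can be sketched on its own as below. `buildHostRule` is a hypothetical name, and returning an empty string for an empty map is an assumption of this sketch.

```typescript
// Hypothetical helper mirroring the string-building step in applyPatches():
// turns collected custom properties into a single ":host" rule.
function buildHostRule(cssVars: Map<string, string>): string {
  if (cssVars.size === 0) return "";
  const declarations = Array.from(cssVars)
    .map(([name, value]) => `${name}: ${value}`)
    .join("; ");
  return `:host { ${declarations}; }`;
}

const vars = new Map<string, string>([
  ["--main-color", "#336699"],
  ["--gap", "4px"],
]);
console.log(buildHostRule(vars)); // :host { --main-color: #336699; --gap: 4px; }
```

The plugin writes the resulting rule into a new `<style>` element appended to the document body, so `var(--main-color)` lookups inside the shadow tree resolve again.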
3 changes: 2 additions & 1 deletion versions.json
@@ -2,5 +2,6 @@
"1.0.0": "0.15.0",
"1.0.1": "0.15.0",
"1.0.2": "0.15.0",
- "1.0.3": "0.15.0"
+ "1.0.3": "0.15.0",
+ "1.0.4": "0.15.0"
}
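
Assuming `versions.json` maps each plugin version to the minimum Obsidian version it requires (the values match `minAppVersion` in `manifest.json` above), the lookup it enables can be sketched as follows. `compareVersions` and `latestCompatible` are hypothetical helpers written for illustration, not Obsidian APIs, and the sketch handles only plain dotted numeric versions.

```typescript
type VersionMap = Record<string, string>;

// Compares dotted numeric versions such as "0.15.0"; negative when a < b.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Returns the newest plugin version whose minimum app version is satisfied,
// or null when no listed version is compatible.
function latestCompatible(versions: VersionMap, appVersion: string): string | null {
  const candidates = Object.keys(versions)
    .filter((plugin) => compareVersions(versions[plugin], appVersion) <= 0)
    .sort(compareVersions);
  return candidates.length > 0 ? candidates[candidates.length - 1] : null;
}

// Contents of versions.json after this commit.
const versions: VersionMap = {
  "1.0.0": "0.15.0",
  "1.0.1": "0.15.0",
  "1.0.2": "0.15.0",
  "1.0.3": "0.15.0",
  "1.0.4": "0.15.0",
};

console.log(latestCompatible(versions, "0.15.9")); // 1.0.4
console.log(latestCompatible(versions, "0.14.0")); // null
```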
