Origin Private File System for the WASM version #113
Comments
Hi @pedrovgs, you can use a function like this to read from OPFS:
async function fileToByteArray(fileName) {
  try {
    // Get the OPFS root directory handle
    const dirHandle = await navigator.storage.getDirectory();
    // Get the file handle
    const fileHandle = await dirHandle.getFileHandle(fileName);
    // Get the File object from the file handle
    const file = await fileHandle.getFile();
    // Read the whole file as an array buffer
    const arrayBuffer = await file.arrayBuffer();
    // View the buffer as bytes (avoids the extra copy made by Array.from)
    return new Uint8Array(arrayBuffer);
  } catch (e) {
    // Keep the original error reachable through `cause`
    throw new Error(`Failed to read "${fileName}" from OPFS`, { cause: e });
  }
}
// Usage example
(async () => {
  try {
    const byteArray = await fileToByteArray('example-file.txt');
    console.log(byteArray);
  } catch (e) {
    console.error(e);
  }
})();
Thanks. |
Thanks for the quick response, @paulocoutinhox! That's the solution we are already using, but we would like to avoid it: reading the array buffer from the file means allocating the whole file content in memory, and for big PDFs that can be many megabytes. We would like to use an API that reads the file from the file system by path, so we can avoid that allocation when possible. Do you know if this can be done? AFAIK Emscripten should provide access to a file system, but I don't know whether it supports OPFS. The LoadDocument function is available and can be invoked, but in my tests the document load always fails. |
Hi, You can use it, check: Example:
|
Hi, the project was updated to the latest version. If your problem is solved, could you please close the issue? Thanks. |
Hey @paulocoutinhox, I'm going to close the issue, but I don't think this solves the problem we had. We are looking for an API that doesn't require us to read the whole file from OPFS at once, because we handle huge PDFs and that can consume a lot of memory. |
Why not use this:
#include "fpdfview.h"
#include "fpdf_doc.h"
#include "fpdf_text.h"
#include <iostream>
#include <fstream>
// custom read callback; pdfium expects non-zero on success and 0 on error
int readBlock(void* param, unsigned long pos, unsigned char* pBuf, unsigned long size) {
    std::ifstream* file = static_cast<std::ifstream*>(param);
    file->clear();  // reset eof/fail bits left by an earlier read
    if (!file->seekg(pos)) return 0;
    file->read(reinterpret_cast<char*>(pBuf), size);
    return file->gcount() == static_cast<std::streamsize>(size) ? 1 : 0;
}
int main() {
    // initialize pdfium
    FPDF_InitLibrary();
    std::ifstream file("your_file.pdf", std::ios::binary);
    if (!file.is_open()) {
        std::cerr << "failed to open the pdf file." << std::endl;
        FPDF_DestroyLibrary();
        return -1;
    }
    // configure file access (the stream must outlive the document)
    FPDF_FILEACCESS fileAccess{};
    fileAccess.m_FileLen = static_cast<unsigned long>(file.seekg(0, std::ios::end).tellg());
    file.seekg(0, std::ios::beg);
    fileAccess.m_GetBlock = &readBlock;
    fileAccess.m_Param = &file;
    // load the pdf document using custom read access
    FPDF_DOCUMENT document = FPDF_LoadCustomDocument(&fileAccess, nullptr);
    if (!document) {
        std::cerr << "failed to load the pdf document." << std::endl;
        FPDF_DestroyLibrary();
        return -1;
    }
    // get the total number of pages
    int page_count = FPDF_GetPageCount(document);
    std::cout << "total number of pages: " << page_count << std::endl;
    // load every page in turn
    for (int i = 0; i < page_count; ++i) {
        FPDF_PAGE page = FPDF_LoadPage(document, i);
        if (!page) {
            std::cerr << "failed to load page " << i + 1 << "." << std::endl;
            continue;
        }
        std::cout << "page " << i + 1 << " loaded successfully." << std::endl;
        // here you can perform actions with the loaded page
        // such as extracting text, rendering, etc.
        // don't forget to close the page after use
        FPDF_ClosePage(page);
    }
    // close the document after finishing
    FPDF_CloseDocument(document);
    // destroy pdfium
    FPDF_DestroyLibrary();
    return 0;
} |
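The block-reading callback is the part of the example above that decides how much of the file is held in memory, so it may help to look at it in isolation. The following is a minimal sketch, not PDFium code: it assumes the `m_GetBlock` contract documented in `fpdfview.h` (return non-zero on success, zero on error) and that `param` points at a `std::istream`. The names `ReadBlockFromStream` and `ReadRange` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Same parameter and return types as the m_GetBlock pointer in fpdfview.h.
// Assumption: `param` was created from a std::istream* (for a std::ifstream,
// convert with static_cast<std::istream*>(&file) before storing it as void*).
int ReadBlockFromStream(void* param, unsigned long pos,
                        unsigned char* buf, unsigned long size) {
    assert(param != nullptr);
    auto* in = static_cast<std::istream*>(param);
    in->clear();  // an earlier short read may have left eofbit/failbit set
    if (!in->seekg(static_cast<std::streamoff>(pos), std::ios::beg)) {
        return 0;  // position is outside the stream
    }
    in->read(reinterpret_cast<char*>(buf), static_cast<std::streamsize>(size));
    // Only a complete block counts as success.
    return in->gcount() == static_cast<std::streamsize>(size) ? 1 : 0;
}

// Hypothetical demo helper: returns bytes [pos, pos + size) as a string,
// or an empty string when the callback reports failure.
std::string ReadRange(std::istream& in, unsigned long pos, unsigned long size) {
    std::string out(size, '\0');
    const int ok = ReadBlockFromStream(
        &in, pos, reinterpret_cast<unsigned char*>(&out[0]), size);
    return ok ? out : std::string();
}
```

With a `std::istringstream` holding `hello world`, `ReadRange(in, 6, 5)` yields `world`, and a request that runs past the end yields an empty string. Only the requested range is copied into the caller's buffer, which is the property the thread is asking for. Whether the stream itself can be backed by OPFS in the WASM build is a separate question that this sketch does not answer.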
Hi @CetinSert, can you help me change it to use "LoadCustomDocument"? I made a small change to test it: But I'm getting an error. Thanks. |
Hi! I will take a look soon and edit this comment with what I come up with! |
Hi @paulocoutinhox. First of all, thank you so much for this awesome project!
I've been checking the WASM version, including the demo provided, and I'm wondering whether you have tried loading files from OPFS instead of from a byte array. I'm trying to do it, but I can't get it working and I don't even know if it is supported, so I'm asking just in case. Thanks!