Commit Graph

367 Commits

Author SHA1 Message Date
CalDescent
57e82b62a1 Increased the capabilities of the service validation functions. 2021-12-05 13:03:22 +00:00
CalDescent
94b17eaff3 Added unit test for random file deletion, and fixed some issues found via the test. 2021-12-04 16:36:20 +00:00
CalDescent
a3038da3d7 Moved some shared arbitrary test methods to a new ArbitraryUtils class. 2021-12-04 14:26:10 +00:00
CalDescent
36c5b71656 Delete data associated with a name at random, if a name is using more than its allocated limit.
This would happen if a name fills their limit, and then additional names are followed. Alternatively it could happen if the total storage capacity reduces due to disk space being used by other apps. Chunks are deleted at random to reduce the chance of the same chunk being deleted everywhere. Data loss is possible here for transactions that don't have many peers. We'll have to see in practice how much of a problem this is, but it's better than the scenario where one content creator consumes all space on their followers' nodes, leaving no space for other names that are subsequently followed.
2021-12-04 14:23:09 +00:00
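The random deletion described in commit 36c5b71656 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name, the method name, and the idea of passing chunk sizes in as a map are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: choose chunks to delete at random until a name's
// stored data fits within its allocated limit.
final class RandomChunkEvictionSketch {

    /**
     * @param chunkSizes map of chunk hash to chunk size in bytes (hypothetical input shape)
     * @param limitBytes the storage limit currently allocated to this name
     * @param random     source of randomness, injectable so tests can seed it
     * @return the chunk hashes selected for deletion, in the order they were picked
     */
    static List<String> chooseChunksToDelete(Map<String, Long> chunkSizes, long limitBytes, Random random) {
        long used = 0;
        for (long size : chunkSizes.values()) {
            used += size;
        }

        // Shuffle so that different nodes are unlikely to delete the same chunks.
        List<String> candidates = new ArrayList<>(chunkSizes.keySet());
        Collections.shuffle(candidates, random);

        List<String> toDelete = new ArrayList<>();
        for (String hash : candidates) {
            if (used <= limitBytes) {
                break; // back within the limit, stop deleting
            }
            used -= chunkSizes.get(hash);
            toDelete.add(hash);
        }
        return toDelete;
    }
}
```

Shuffling independently on every node is what reduces the chance of the same chunk disappearing everywhere at once, although, as the commit message notes, data loss remains possible for transactions with few peers.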
CalDescent
a320bea68a Limit the amount of data that can be stored per name.
This is calculated by the total capacity divided by the number of names the node follows. The idea here is that a single content creator can't upload terabytes of data and consume all the space on their followers' nodes. They can only use a proportion, with equal space given to each followed name. And since the limit is dynamic, following more names reduces the allocation to existing names.
2021-12-04 13:33:45 +00:00
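The per-name limit in commit a320bea68a is a simple division, sketched below. The class and method names are hypothetical, and the behaviour when no names are followed is an assumption made only so the example has no division by zero.

```java
// Hypothetical sketch of the per-name storage allocation:
// total capacity divided equally between all followed names.
final class StorageQuotaSketch {

    /** Bytes available to each followed name. */
    static long storageCapacityPerName(long totalCapacityBytes, int followedNameCount) {
        if (totalCapacityBytes < 0) {
            throw new IllegalArgumentException("capacity must not be negative");
        }
        if (followedNameCount <= 0) {
            // Assumption: with nothing followed, treat the whole capacity as available.
            return totalCapacityBytes;
        }
        return totalCapacityBytes / followedNameCount;
    }

    /** True if a name is using more than its current allocation. */
    static boolean isOverLimit(long usedBytes, long totalCapacityBytes, int followedNameCount) {
        return usedBytes > storageCapacityPerName(totalCapacityBytes, followedNameCount);
    }
}
```

Because the divisor is the current number of followed names, following an additional name immediately lowers the allocation of every existing one, which is the dynamic behaviour the commit message describes.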
CalDescent
a2cac003a4 Major rework of chunk hashes
Chunk hashes are now stored off chain in a metadata file. The metadata file's hash is then included in the transaction.

The main benefits of this approach are:
1. We no longer need to limit the total file size, because adding more chunks doesn't increase the transaction size.
2. This increases the chain capacity by a huge amount - a 512MB file would have previously increased the transaction size by 16kB, whereas it now requires only an additional 32 bytes.
3. We no longer need to use variable difficulty; every transaction is the same size and so the difficulty can be constant no matter how large the files are.
4. Additional metadata (such as title, description, and tags) can ultimately be stored in the metadata file, as opposed to using a separate transaction & resource.
5. There is also scope for adding hashes of individual files into the metadata file, if we ever wanted to allow single files to be requested without having to download and build the entire resource. Although this is unlikely to be available in the short term.

The only real negative is that we now have to fetch the metadata file before we know anything about the chunks for a transaction. This seems to be quite a small trade-off by comparison.

Since we're not live yet, there is no backwards support for on-chain hashes, so a new data testchain will be required. This hasn't been tested outside of unit tests yet, so there will likely be several fixes needed before it is stable.
2021-12-01 12:37:21 +00:00
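The size figures quoted in commit a2cac003a4 can be reproduced with a little arithmetic. The sketch below assumes 32-byte SHA-256 hashes and a 1 MiB chunk size; the chunk size is not stated in the message and is only inferred from the "512MB → 16kB" example (512 chunks × 32 bytes). All names are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch comparing on-chain hash overhead before and after the rework.
final class ChunkHashSizeSketch {

    static final int HASH_LENGTH = 32; // bytes in a SHA-256 digest (assumed hash function)

    /** Number of chunks needed for a file, rounding up. */
    static long chunkCount(long fileSizeBytes, long chunkSizeBytes) {
        return (fileSizeBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    /** Before: one hash per chunk was stored in the transaction. */
    static long onChainBytesBefore(long fileSizeBytes, long chunkSizeBytes) {
        return chunkCount(fileSizeBytes, chunkSizeBytes) * HASH_LENGTH;
    }

    /** After: only the hash of the metadata file is stored, whatever the file size. */
    static long onChainBytesAfter() {
        return HASH_LENGTH;
    }

    /** Hash of the off-chain metadata file contents (its format is not specified here). */
    static byte[] metadataHash(byte[] metadataFileContents) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(metadataFileContents);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

With these assumptions, `onChainBytesBefore(512L * 1024 * 1024, 1024L * 1024)` gives 16384 bytes, matching the 16kB figure, while the new approach stays at 32 bytes regardless of file size.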
CalDescent
b7ee00fb22 Fixed errors in Admin API tests due to failing authentication. 2021-11-27 20:08:59 +00:00
CalDescent
ef2ee20820 Merge remote-tracking branch 'qortal/master'
# Conflicts:
#	pom.xml
#	src/main/java/org/qortal/api/resource/ListsResource.java
#	src/main/java/org/qortal/list/ResourceList.java
#	src/main/java/org/qortal/list/ResourceListManager.java
#	src/main/java/org/qortal/transaction/ChatTransaction.java
2021-11-27 19:41:17 +00:00
CalDescent
8e36c456e1 Wait for storage space to be calculated before running storage policy tests. 2021-11-27 18:06:48 +00:00
CalDescent
4b8bcd265b Make sure unit tests use a different lists directory, and delete it before and after each test. 2021-11-27 18:05:25 +00:00
CalDescent
0db681eeda Fixed failing storage policy tests due to not calculating the available storage 2021-11-27 17:56:34 +00:00
CalDescent
bc38184ebf Major rework of local data directory structure
Files are now keyed by signature, in the format:
data/si/gn/signature/hash

For times when there is no signature available (i.e. at the time of initial upload), files are keyed by hash, in the format:
data/_misc/ha/sh/hash

Files in the _misc folder are subsequently relocated to a path that is keyed by the resulting signature.

The end result is that chunks are now grouped on the filesystem by signature. This allows more transparency as to what is being hosted, and will also help simplify the reporting and management of local files.
2021-11-27 13:00:32 +00:00
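The directory layout from commit bc38184ebf could be built along these lines. This is a sketch only: the class and method names are hypothetical, and signatures and hashes are treated as plain strings since their encoding is not stated in the message.

```java
import java.nio.file.Path;

// Hypothetical sketch of the two path formats:
//   data/si/gn/signature/hash   (signature known)
//   data/_misc/ha/sh/hash       (no signature yet, e.g. at initial upload)
final class DataPathSketch {

    static Path pathFor(Path dataRoot, String signature, String hash) {
        if (hash == null || hash.length() < 4) {
            throw new IllegalArgumentException("hash must have at least 4 characters");
        }
        if (signature == null) {
            // Keyed by hash until a signature exists; relocated later.
            return dataRoot.resolve("_misc")
                    .resolve(hash.substring(0, 2))
                    .resolve(hash.substring(2, 4))
                    .resolve(hash);
        }
        if (signature.length() < 4) {
            throw new IllegalArgumentException("signature must have at least 4 characters");
        }
        // Keyed by signature: first two characters, next two, then the full signature.
        return dataRoot.resolve(signature.substring(0, 2))
                .resolve(signature.substring(2, 4))
                .resolve(signature)
                .resolve(hash);
    }
}
```

Splitting on the first four characters keeps any single directory from accumulating too many entries, and grouping by signature means all chunks for one transaction sit together on disk.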
CalDescent
ae0f01d326 Added storage policy unit tests 2021-11-24 11:02:54 +00:00
CalDescent
af8d0a3965 Separated computeNonce() from build() in the transaction builder.
This gives the option of the nonce to be computed elsewhere, such as in the UI, and also allows transaction unit tests to run much more quickly.
2021-11-24 11:02:17 +00:00
CalDescent
1b170c74c0 Modified storage code to support 2 new settings:
publicDataEnabled - whether to store decryptable data (default true)
privateDataEnabled - whether to store data without a decryption key (default false)
2021-11-24 09:38:18 +00:00
CalDescent
73e609fa29 Replaced all IllegalStateException with DataException in arbitrary code
This was necessary to ensure that all exceptions are caught intentionally, as otherwise it creates endless amounts of edge cases.
2021-11-19 21:42:03 +00:00
CalDescent
3860c5d8ec Fixed some failing tests. 2021-11-19 16:12:31 +00:00
CalDescent
fb09d77cdc Rework of "Service" types to allow for validation
Each service supports basic validation params, plus has the option for an entirely custom validation function.

Initial validation settings:
- IMAGE must be less than 10MiB
- THUMBNAIL must be less than 500KiB
- METADATA must be less than 10KiB and must contain JSON keys "title", "description", and "tags"
2021-11-16 19:28:25 +00:00
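The initial validation settings in commit fb09d77cdc might be modelled as an enum, as in the sketch below. The enum name, its fields, and the choice to pass in already-parsed JSON keys (rather than parse JSON here) are assumptions for illustration, and the custom validation function mentioned in the message is not modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of per-service validation parameters.
enum ServiceSketch {
    IMAGE(10L * 1024 * 1024, Collections.<String>emptyList()),     // less than 10MiB
    THUMBNAIL(500L * 1024, Collections.<String>emptyList()),       // less than 500KiB
    METADATA(10L * 1024, Arrays.asList("title", "description", "tags")); // less than 10KiB + keys

    private final long maxSizeBytes;
    private final List<String> requiredKeys;

    ServiceSketch(long maxSizeBytes, List<String> requiredKeys) {
        this.maxSizeBytes = maxSizeBytes;
        this.requiredKeys = requiredKeys;
    }

    /**
     * @param sizeBytes size of the data being validated
     * @param jsonKeys  top-level JSON keys found in the data (empty for non-JSON services)
     */
    boolean isValid(long sizeBytes, Set<String> jsonKeys) {
        // "Must be less than" is read as a strict comparison.
        return sizeBytes < maxSizeBytes && jsonKeys.containsAll(requiredKeys);
    }
}
```

For example, `ServiceSketch.METADATA.isValid(size, keys)` only passes when the data is under 10KiB and the key set includes "title", "description", and "tags".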
CalDescent
c069c39ce1 Implemented automatic PUT/PATCH detection
When using POST /arbitrary/{service}/{name}... it will now automatically decide which method to use (PUT/PATCH) based on a few factors:

- If there are already 10 or more layers, use PUT to reset back to a single layer
- If the next layer's patch is more than 20% of the total resource file size, use PUT
- If the next layer modifies more than 50% of the total file count, use PUT
- Otherwise, use PATCH

The PUT method causes a new base layer to be created and all previous update history for that resource becomes obsolete. The PATCH method adds a small delta layer on top of the existing layer(s).

The idea is to wipe the slate clean with a new base layer once the patches start to get demanding for the network to apply. Nodes which view the content will ultimately have build timeouts to prevent someone from deploying a resource with hundreds of complex layers for example, so this approach is there to maximize the chances of the resource being buildable.

The constants above (10 layers, 20% total size, 50% file count) will most likely need tweaking once we have some real world data.
2021-11-13 09:56:13 +00:00
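The PUT/PATCH decision rules listed in commit c069c39ce1 translate fairly directly into code. The following is a sketch under stated assumptions: the class, method, and parameter names are hypothetical, and treating "no existing layers" as a PUT is an assumption, since the message only covers updates to existing resources.

```java
// Hypothetical sketch of automatic PUT/PATCH selection.
final class UpdateMethodSketch {

    enum Method { PUT, PATCH }

    static final int MAX_LAYERS = 10; // 10 or more existing layers forces a PUT

    static Method choose(int existingLayers, long patchSizeBytes, long totalSizeBytes,
                         int modifiedFileCount, int totalFileCount) {
        if (existingLayers <= 0) {
            return Method.PUT; // assumption: nothing to patch against yet
        }
        if (existingLayers >= MAX_LAYERS) {
            return Method.PUT; // reset back to a single layer
        }
        // Patch is more than 20% of the total resource size (integer form of patch/total > 0.2).
        if (patchSizeBytes * 5 > totalSizeBytes) {
            return Method.PUT;
        }
        // More than 50% of the files are modified (integer form of modified/total > 0.5).
        if (modifiedFileCount * 2L > totalFileCount) {
            return Method.PUT;
        }
        return Method.PATCH;
    }
}
```

The thresholds are kept as integer comparisons to avoid floating-point edge cases at exactly 20% and 50%; as the commit message says, the constants themselves are expected to be tuned once real-world data is available.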
CalDescent
caf163f98c Include "tempDataPath" in test settings so that tests don't put files in the main temp directory. 2021-11-12 17:46:48 +00:00
CalDescent
236a456cae Added support for single file uploads.
This process could potentially be simplified if we were to modify the structure of the actual zipped data (on the writer side), but this approach is more of a "catch-all" (on the reader side) to support multiple different zip structures, giving us more flexibility. We can still modify the written zip structure later if we choose to, which would then cause most of this new code to be skipped.

Note: the filename of a single file is not currently retained; it is renamed to "data" as part of the packaging process. Need to decide if this is okay before we go live.
2021-11-12 13:35:50 +00:00
CalDescent
056fc8fbaf Treat a blank identifier as null 2021-11-12 08:59:43 +00:00
CalDescent
4b1a5a5e14 Connected the rest of the system up to the recently added "identifier" feature. 2021-11-11 09:12:54 +00:00
CalDescent
b5feb5f733 Fixed test which was failing due to an earlier commit 2021-11-07 18:41:52 +00:00
CalDescent
991125034e Added "identifier" property to arbitrary transactions
Until now we have been limited to one data resource per name/service combination. This meant that each name could only have a single website, git repo, image, video, etc, and adding another would overwrite the previous data. The identifier property now allows an optional string to be supplied with each resource, therefore allowing an unlimited number of resources per name/service combination.

Some examples of what this will allow us to do:

- Create a video library app which holds multiple videos per name
- Same as above but for photos
- Store multiple images against each name, such as an avatar, website thumbnails, video thumbnails, etc. This will be necessary for many "system level" features.
- Attach multiple websites to each name. The default website (with blank/null identifier) would remain the entry point, but other websites could be hosted essentially as subdomains, and then linked from the default site. This also provides a means to go beyond the 500MB website size limit.

Not all of these features will exist initially, but having this identifier included in the protocol layer allows them to be added at any time.
2021-11-07 18:39:43 +00:00
CalDescent
09a7fcaba4 Added MissingDataException
This is generated whenever a data resource cannot be built because it is missing data for at least one layer. Using a custom exception type here enables a few new features:

1. A single build process is now able to request missing data from all the layers that need it. Previously it would only request from the first missing layer and would then give up. This resulted in the user/application having to issue the build command multiple times rather than just once, until all layers had been requested.

2. GET /arbitrary/{service}/{name} will now block the response and retry in the background until the data arrives. This allows it to be used synchronously. Note: we'll need to add a timeout.

3. Loading a website via GET /site/{name} will avoid adding to the failed builds queue when a MissingDataException is thrown, which allows it to be quickly retried. The interface already auto refreshes, allowing the site to load as soon as it's available.
2021-11-04 09:09:54 +00:00
CalDescent
3b914d4a7f Improved trade bot backups so that the current order being bought is included.
This should fix any key recovery issues if the node crashes or otherwise fails when buying an offer.
2021-11-03 19:27:56 +00:00
CalDescent
ede4802ceb Converted ArbitraryTestTransaction to version 5
This fixes a failed serialization test when Transaction.getVersionByTimestamp() returns 5
2021-11-03 19:19:27 +00:00
CalDescent
fe79119809 Added PresenceTestTransaction, to allow SerializationTests.testTransactions() to be unblocked 2021-11-03 19:18:26 +00:00
CalDescent
b771544c5d Added test to check website/data updates. 2021-11-02 09:09:54 +00:00
CalDescent
cbb2dbffb9 The /arbitrary/search API endpoint now uses a string instead of an int for the "service", and shows a dropdown of possible values in the API documentation page. 2021-10-31 21:22:51 +00:00
CalDescent
cbed6418e7 Added ability to filter arbitrary transactions by name when searching. 2021-10-31 21:07:14 +00:00
CalDescent
d82da160f3 Added DHT-style lookup table to track file locations
This maps ARBITRARY transactions to peer addresses, but also includes additional metadata/stats to track the success rate and reachability.

Once a node receives files for a transaction, it broadcasts this info to its peers so they can update their records.

TLDR: this allows us to locate peers that are hosting a copy of the file we need.
2021-10-29 13:35:17 +01:00
CalDescent
a55fc4fff9 When validating an ARBITRARY transaction, ensure that the supplied name exists and is registered to the account that is signing the transaction.
This ensures that only the owner of a name is able to update data associated with that name.

Note that this doesn't take into account the ability for group members to update a resource, so this will need modifying when that feature is ultimately introduced (likely after v3.0)
2021-10-25 18:58:33 +01:00
CalDescent
35a7a70b93 Merge remote-tracking branch 'qortal/master' 2021-10-25 18:26:06 +01:00
CalDescent
69e557e70d Delete .sha256 file which was left lying around after running the bootstrap unit tests. 2021-10-25 18:20:58 +01:00
CalDescent
707eb58068 Added testPatchBeforePut() unit test 2021-10-24 22:43:28 +01:00
CalDescent
8630f3be96 Added first end-to-end test of data storage 2021-10-24 19:08:09 +01:00
CalDescent
c222c4eb29 Updated expected hash of demo data as it has been updated. 2021-10-24 17:41:51 +01:00
CalDescent
52a94e3256 Added "validateAllDataLayers" setting (default false)
When true, the hashes of every layer are validated when building a data resource. When false, only the final layer's hash is validated.
2021-10-24 14:37:29 +01:00
CalDescent
a418fb18b6 Hash the current state when creating a patch
This is included in the patch metadata and then validated every time it is rebuilt.
2021-10-24 13:00:21 +01:00
CalDescent
9cd579d3db Another typo 2021-10-24 12:20:49 +01:00
CalDescent
e1a6ba7377 Fixed incorrect comment. 2021-10-24 12:03:22 +01:00
CalDescent
04aabe0921 Include the original file instead of a patch if the patch is larger than the original file.
This saves processing and disk space, as there is no point in applying a patch when the original file is smaller and can be included in its entirety.
2021-10-24 12:02:09 +01:00
CalDescent
8dd4d71d75 Significant rework of patches
- The "diff type" is now specified per file, allowing for different diff methods in each modified file.
- Patches will only be created when both the before and after files are less than 100KiB in size.
- Patches are validated after creation, and if invalid it will fall back to including the entire file.

This has identified a bug where patching fails for files without trailing newline characters, which still needs to be fixed. Until then, it will fall back to including the entire file in these cases.
2021-10-24 10:47:47 +01:00
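The per-file patch rules from commit 8dd4d71d75, combined with the size check from the later commit 04aabe0921, could be expressed as a single decision function. This is a hedged sketch: all names are hypothetical, the patch itself is assumed to have been created and validated elsewhere, and comparing the patch size against the size of the new ("after") file is one reading of "larger than the original file".

```java
// Hypothetical sketch: decide whether a modified file is shipped as a patch
// or included in its entirety.
final class PatchDecisionSketch {

    enum Inclusion { PATCH, COMPLETE_FILE }

    static final long MAX_PATCHABLE_SIZE = 100L * 1024; // 100KiB

    /** Patches are only attempted when both versions of the file are under 100KiB. */
    static boolean isPatchable(long beforeSizeBytes, long afterSizeBytes) {
        return beforeSizeBytes < MAX_PATCHABLE_SIZE && afterSizeBytes < MAX_PATCHABLE_SIZE;
    }

    static Inclusion decide(long beforeSizeBytes, long afterSizeBytes,
                            long patchSizeBytes, boolean patchValid) {
        if (!isPatchable(beforeSizeBytes, afterSizeBytes)) {
            return Inclusion.COMPLETE_FILE;
        }
        if (!patchValid) {
            // Validation failed (e.g. the trailing-newline bug): fall back to the whole file.
            return Inclusion.COMPLETE_FILE;
        }
        if (patchSizeBytes > afterSizeBytes) {
            // Assumption: a patch larger than the file it produces is not worth applying.
            return Inclusion.COMPLETE_FILE;
        }
        return Inclusion.PATCH;
    }
}
```

Every failure path falls back to including the complete file, so a bad or oversized patch costs some space but never prevents the resource from being built.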
CalDescent
12b3267d5c Added arbitrary data merge tests. 2021-10-22 11:49:15 +01:00
CalDescent
f3ef112297 Merge remote-tracking branch 'qortal/master'
# Conflicts:
#	.gitignore
#	pom.xml
#	src/main/java/org/qortal/controller/Controller.java
#	src/main/java/org/qortal/gui/SysTray.java
#	src/main/java/org/qortal/settings/Settings.java
#	src/main/resources/i18n/ApiError_en.properties
#	src/test/java/org/qortal/test/CryptoTests.java
#	src/test/resources/test-settings-v2.json
2021-10-15 09:03:28 +01:00
CalDescent
a78af8f248 Added SHA-256 file digest utility methods.
These read the file in small chunks, to reduce memory.
2021-10-09 16:22:21 +01:00
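A chunked SHA-256 file digest of the kind described in commit a78af8f248 might look like the sketch below. The class and method names and the 8KiB buffer size are assumptions; only standard `java.security.MessageDigest` and `java.nio.file.Files` calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: hash a file by streaming it through a small buffer,
// so memory use stays constant regardless of file size.
final class FileDigestSketch {

    private static final int BUFFER_SIZE = 8192; // assumed buffer size

    static byte[] sha256(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            return sha256(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] sha256(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[BUFFER_SIZE];
            int read;
            // Feed the digest one buffer at a time instead of loading the whole file.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available in Java
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lowercase hex encoding, for display or comparison. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Checked exceptions are wrapped here purely to keep the example short; real code may prefer to propagate `IOException` to the caller.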
CalDescent
a3dcacade9 Now showing errors directly in the POST /bootstrap/create API response.
This avoids needing to check the log file each time.
2021-10-09 11:02:21 +01:00
CalDescent
a1e4047695 Rework of bootstrap finalization process. 2021-10-08 18:06:41 +01:00