
Conversation

@man4j commented May 12, 2016

I have a remote CouchDB with 30,000 docs. I replicateFrom all docs, but(!) the replicateTo 'paused' event is never triggered. I also see that PouchDB sends _bulk_docs, _local and _revs_diff requests to CouchDB every second, and in CouchDB I see:

{
   "_id":"_local/VfNqh5fhbhdOCfRFn8pEmA==",
   "_rev":"0-675",
   "session_id":"4241BF48-47C3-5446-B3FE-F329644B5F2E",
   "history":[
      {
         "last_seq":675,
         "session_id":"4241BF48-47C3-5446-B3FE-F329644B5F2E"
      }
   ],
   "replicator":"pouchdb",
   "version":1,
   "last_seq":675
}

where last_seq increments by one. Only after last_seq reaches 30,000 is the 'paused' event triggered in replicateTo. But when I revert to batches.length === 0, the 'paused' event triggers immediately, as expected.

But I have question number 2: why does PouchDB send _bulk_docs, _local and _revs_diff to CouchDB every second? In the CouchDB log I see the same _local checkpoint document as above, with last_seq incrementing by one. I haven't created any docs locally in Pouch; all docs were replicated from the remote CouchDB. Oh... my head... Thanks!

@man4j (Author) commented May 12, 2016

I got it: batch_size does not work when the call is processPendingBatch(changesOpts.live). In the reverted variant the default batch_size is 1, but I can change that option and it works.

@daleharvey (Contributor)

@man4j Hi, thanks for the PR. It looks like there are test failures in this fix, and I am having a little trouble understanding what your bug is.

You say you see all_docs etc. requests and 'paused' events not firing; are you saying these requests fire after all 30,000 docs have replicated? If you could provide a code snippet, or ideally working code that exhibits the failure, that would be super useful. Thank you!

@man4j (Author) commented May 17, 2016

@daleharvey At the moment the bug is in the batch_size parameter. In the current version batch_size is effectively 1 and not configurable. Please fix this bug. The replicateTo method is very slow, because with batch_size = 1 it sends many requests to CouchDB. My fix may not be fully correct, but with it batch_size works (the default is still 1, but I can configure it).

@daleharvey (Contributor)

In the changes feed the default batch_size is 25 and can be configured (https://github.com/pouchdb/pouchdb/blob/master/src/adapters/http/index.js#L712); in replication the default batch size is 100 and is also configurable (https://github.com/pouchdb/pouchdb/blob/master/src/replicate/replicate.js#L23).

Can you clarify the problem in terms of the behaviour you are seeing? Is the entire problem that replication is too slow / uses lots of requests, or are you seeing functional bugs? Again, example code would be really useful.

@man4j (Author) commented May 17, 2016

@daleharvey Ok, this is my example:

<!DOCTYPE html>
<html>
<head>
    <title>Pouch test</title>

    <meta charset="utf-8">
</head>

<body class="background-dark">
    <script type="text/javascript" src="lib/pouchdb-5.3.2.js"></script>
    <script>
        var db = new PouchDB('test_db', {adapter: 'websql'});

        db.destroy().then(function(){
            var db = new PouchDB('test_db', {adapter: 'websql'});

            var remote_db = new PouchDB('http://kaizen-retail.com:5984/test_bug_db');
            db.replicate.from(remote_db, {
                live: true,
                retry: true,
                batches_limit: 10,
                batch_size: 200
            });
            db.replicate.to(remote_db, {
                live: true,
                retry: true,
                batches_limit: 10,
                batch_size: 200
            });
        });
    </script>
</body>
</html>

Very easy, right?

And if I open the CouchDB log (http://kaizen-retail.com:5984/_log?bytes=10000) I see two interesting URLs:

GET /test_bug_db/_local/7ERAWapZlPnVyIZ6T5LL8w%3D%3D? 200
GET /test_bug_db/_local/Hulxcu.C6Wqz_UbINSWOoA%3D%3D? 200

If I open these URLs in my browser, I see:

For the first URL (replicateTo):
{"_id":"_local/7ERAWapZlPnVyIZ6T5LL8w==","_rev":"0-198","session_id":"349620D1-053E-900C-B604-390F1D006686","history":[{"last_seq":198,"session_id":"349620D1-053E-900C-B604-390F1D006686"}],"replicator":"pouchdb","version":1,"last_seq":198}
last_seq is 198, incremented one by one, and the batch_size option is ignored! Regression since 5.2.1!

For the second URL (replicateFrom):
{"_id":"_local/Hulxcu.C6Wqz_UbINSWOoA==","_rev":"0-145","session_id":"6036077D-0E30-3E15-829B-2388F7C2CAEE","history":[{"last_seq":29000,"session_id":"6036077D-0E30-3E15-829B-2388F7C2CAEE"}],"replicator":"pouchdb","version":1,"last_seq":29000}
last_seq is 29000, incremented by 200; that is correct.

Please fix this bug.
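The symptom above can be checked mechanically. Here is a small illustrative helper (not part of PouchDB) that derives the checkpoint stride from successive last_seq values in the _local checkpoint doc; a healthy replication with batch_size 200 should show strides near 200, while the broken replicateTo shows strides of 1:

```javascript
// Illustrative helper, not part of PouchDB: given the successive last_seq
// values recorded in the _local checkpoint doc, return the step sizes.
// When batch_size is honored, strides should be close to the batch size;
// a stride of 1 means every single doc gets its own checkpoint write.
function checkpointStrides(lastSeqs) {
  const strides = [];
  for (let i = 1; i < lastSeqs.length; i++) {
    strides.push(lastSeqs[i] - lastSeqs[i - 1]);
  }
  return strides;
}

console.log(checkpointStrides([196, 197, 198]));       // [1, 1] — the broken replicateTo
console.log(checkpointStrides([28600, 28800, 29000])); // [200, 200] — healthy replicateFrom
```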

@dharders (Contributor)

I have the exact same issue. See #5167

I didn't notice this pull request until just now... your fix definitely resolves my issue.

Check out the commit that introduced this issue. Specifically the test...

I'm a little suspicious of the comparison added:
numChanges += c.docs_written; if (numChanges === 3)

Is c.docs_written always 1? If not, is it possible that we could skip numChanges === 3, by say adding 2 + 2? Should we have a >= catch-all?

@dharders (Contributor)

Also, it appears you are just reverting this commit without updating the tests that were committed with that one line of source change.

@daleharvey (Contributor)

Awesome, that is more info to try and get to the bottom of this. For some more discussion about what prompted the original commit, #4753 is the PR and #3824 is the original issue.

@dharders (Contributor)

Looking at PR #4753, processPendingBatch(false) is needed for non-continuous replications.

So could we simply change:

processPendingBatch(batches.length === 0)
to
processPendingBatch(batches.length === 0 && changesOpts.live)

to satisfy both continuous (i.e. { live: true }) and non-continuous replications?

@man4j Could you try this and see how the tests go ?
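The proposed condition can be read as a tiny predicate. This is an illustrative sketch with made-up names, not PouchDB's actual internals: flush the pending batch immediately only when the batch queue has drained and the replication is live.

```javascript
// Illustrative sketch, not PouchDB source. The boolean passed to
// processPendingBatch() decides whether the pending batch is flushed
// immediately (effectively ignoring batch_size) or allowed to fill first.
function flushPendingImmediately(pendingBatchCount, live) {
  // Live replications flush once the queue drains, so changes propagate
  // promptly; one-shot replications keep accumulating up to batch_size.
  return pendingBatchCount === 0 && live;
}

console.log(flushPendingImmediately(0, true));  // true: live and drained
console.log(flushPendingImmediately(0, false)); // false: one-shot keeps batching
```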

@man4j (Author) commented May 19, 2016

Yes, it fixes the bug. I'm very happy! Thanks!

@nolanlawson (Member)

This PR is failing the retry test.

@nolanlawson (Member)

Well, I tried to replicate this issue in a unit test, and I admit I'm unable to: f289d50.

However, I did try out @man4j's test case, and I agree that there are serious performance problems replicating from the given CouchDB server, which I am unable to reproduce locally because it probably involves failed requests across the network (hence I cannot repro with localhost).

What I'm seeing a lot of is this:

[Screenshot, 2016-05-21: many simultaneous GET requests, most of them canceled]

I.e. the browser is being starved because we're making many, many simultaneous GET requests, the majority of which are canceled. Then it tries again. All the while, the UI runs very slowly because the concurrent Ajax requests are starving the UI thread.

I think this is another issue that requires its own fix (throttle network requests in bulkGetShim()), although it doesn't address the issue raised in this PR. I'm still unable to find a test that 1) fails on master but 2) succeeds with the proposed fix (i.e. #5180).
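Throttling could look something like the following plain-JS concurrency limiter. This is illustrative only (not the actual bulkGetShim() fix from #5195): it caps the number of in-flight tasks instead of firing them all at once, which is the general shape of the cure for the request starvation above.

```javascript
// Illustrative concurrency limiter, not PouchDB source: run at most
// `limit` async tasks at a time, queueing the rest until a slot frees up.
function makeLimiter(limit) {
  let active = 0;
  const queue = [];

  function next() {
    // Start a queued task only when a slot is free.
    if (active >= limit || queue.length === 0) {
      return;
    }
    active++;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // free the slot, then pull the next queued task
      });
  }

  // Returns a promise for the task's result, same as running it directly.
  return function run(task) {
    return new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
  };
}

// Hypothetical usage: cap in-flight fetches at 2 instead of firing them all.
// const limited = makeLimiter(2);
// urls.map((u) => limited(() => fetch(u)));
```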

@nolanlawson (Member) commented May 22, 2016

#5195 attempts to address the starvation issue, but we'd still need a repro for the changesOpts.live issue.

@nolanlawson (Member)

Closing in favor of #5196; please move discussion over there.
